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ABSTRACT 


Grogan.  Timothy  Alan.  Ph.D.,  Purdue  l  niversitv,  August  1983.  SHARE 
RECOGNITION' AND  DESCRIPTION:  A  COMPARATIVE  STUDY.  Major 
Professor:  O.  Robert  Mitchell. 

An  important  problem  in  the  extraction  of  information  from  images  is 
shape  recognition.  Several  methods  of  analyzing  binary  images  using  global 
shape  methods  based  upon  functional  approximation  have  been  reported  in  the 
literature.  However,  there  has  been  a  lack  of  information  comparing  the 
effectiveness  of  these  methods  in  shape  analysis. 

Five  methods  of  global  shape  analysis  are  compared  on  two  basis.  The 
five  methods  compared  are  1)  Fourier  descriptors  of  the  boundary,  2)  Walsh 
points  of  the  boundary,  3)  the  cumulative  angular  deviant  Fourier  descriptors, 
4)  moments  of  the  silhouette,  and  5)  moments  of  the  boundary.  First,  the 
different  methods  are  introduced,  their  geometric  properties  presented,  and  the 
formulae  for  some  generic  shapes  are  provided.  Then  the  methods  are 
compared  on  the  basis  of  the  empirical  facts  derived  from  a  set  of  aircraft 
shape  recognition  experiments.  The  shapes  are  different  views  of  six  aircraft. 
The  aircraft  silhouettes  are  the  two-dimensional  projected  images  of  three- 
dimensional  rigid  bodies.  The  five  methods  are  ranked  according  to  their 
performance  from  these  experiments. 

A  new  method  for  the  recognition  of  partial  shapes  based  on  the  Fourier- 

.  * 

Mellin  transform  is  introduced.  A  shift  and  scale  invariant  correlation  of  the 
complete  and  partial  shape’s  curvature  functions  is  obtained  by  applying  the 


Mellin  transform  to  the  magnitude  of  the  Fourier  transform  of  the  curvature 
functions  The  logarithm  of  the  shift  in  the  correlation  function  corresponds  to 
the  time  scaling  of  the  partial  shape’s  curvature  function.  Then  the  shift  in  an 
ordinary  correlation  of  the  complete  shape’s  curvature  and  the  scaled  partial 
shape’s  curvature  is  the  time  shift  necessary  to  complete  the  alignment.  Then 
a  pointwise  comparison  of  the  curvature  functions  can  be  made  to  determine 
matching  and  non-matching  contour  segments.  Some  initial  recognition 
experiments  for  partial  shapes  are  carried  out  and  the  results  reported. 


CHAPTER  1 


INTRODUCTION 

1.1  Shape:  An  Introduction 

The  function  of  many  information  processing  systems  is  to  either  extract 
information  from  a  signal  or  to  generate  a  signal  from  the  realization  of 
algorithms  which  contain  encoded  information.  This  shape  information  can 
then  be  used  to  describe  a  known  shape  or  to  recognize  (detect  or  classify)  an 
unknown  shape.  This  report  compares  several  methods  of  representing  shape 
information.  In  particular,  the  shapes  dealt  with  herein  will  be  two- 
dimensional  regions  in  the  plane  that  are  simply  connected.  The  representation 
of  a  three-dimensional  object  (a  simply  connected  region)  will  also  be  discussed 
by  way  of  its  projection  onto  the  plane. 

The  objective  then  is  to  characterize  the  shape  by  extracting  measured 
quantities.  This  might  appear  to  be  very  straight  forward,  but  shape  is  an 
allusive  quality  to  quantify.  Shape  can  be  contrasted  with  texture.  Texture  is 
also  difficult  to  define.  Texture  is  the  random  or  chaotic  non-form  of  a  signal 
and  shape  is  the  structure  or  form  of  the  signal.  These  definitions  however 
have  a  fuzzy  boundary  between  them. 


1.2  Introduction  to  Shape  Analysis  Methods 

In  the  study  of  the  architecture  of  neurons  in  the  brain  and  sensory 
nerves,  it  has  been  suggested  [P1TT47]  that  humans  perceive  shape  by 
performing  comparisons  to  forms  already  stored  in  the  brain.  This  comparison 
is  performed  invariant  of  various  transformations  on  the  shape  such  as  scaling 
of  size,  translation,  and  motion.  The  structure  of  the  neurons  implement 
operations  that  are  invariant  over  these  groups  of  transformations  [HOFF78J. 
Others  have  suggested  that  a  shape  is  perceived  when  neural  signals  are 
transformed  until  they  reach  a  stable  signal  or  the  neurons  reach  a  stable 
condition  (move  to  a  strange  attractor[THOM75|).  It  is  at  that  moment  that 
the  shape  is  recognized.  To  actually  implement  this  kind  of  algorithm  on  a 
computing  machine  will  require  massive  parallelism.  Some  preliminary 
investigations  into  parallel  implementation  of  shape  algorithms  have  been 
shown  to  have  increased  effectiveness  [TUOM83]. 

Some  of  the  early  methods  in  shape  analysis  were  based  on  extracting 
gross  parameters  of  the  shape.  For  instance,  topological  invariant  properties 
such  as  the  number  of  connected  components,  the  number  of  holes,  Euler 
number,  etc.  were  tabulated  for  a  binary  image  [ROSE70].  Methods  first  used 
in  biology,  for  instance,  the  statistics  of  random  cord  intersections  and 
tangents,  have  their  theoretical  basis  in  integral  geometry  and  the  theory  of 
random  sets  [SANT76].  Also,  aspect  ratio,  area,  and  perimeter  have  been  used 
[PAVL78].  Although  robust  in  nature,  these  methods  lack  the  ability  to 
characterize  fine  detail  in  and  among  shapes.  The  most  recent  methods  have 
attempted  to  provide  this  necessary  discriminating  ability. 

Others  investigators  [FU82]  have  used  formal  languages  with  sets  of  rules 
(syntax)  to  describe  a  structure  of  a  signal.  Syntactic  methods  represent  local 
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characteristics  of  a  shape  well.  However,  this  approach  begins  having  dillicult  v 
in  generating  the  necessary  rules  in  an  environment  containing  many  shapes  or 
where  one  of  many  views  of  a  three-dimensional  shape  are  possible.  Also,  these 
methods  still  rely  on  some  other  operation  to  first  extract  an  ordered  list  of 
shape  primitives.  These  shape  primitives  or  lists  of  primitives  are  the  words 
and  sentences  processed  to  determine  if  an  allowed  sentence  (a  possible  shape) 
has  the  proper  syntax  (i.e.  has  a  desired  shape.) 

Some  stochastic  models  have  been  developed  for  shape  analysis  [KAS1I81]. 
These  methods  seem  to  be  especially  useful  for  modeling  broad  classes  of 
shapes. 

A  very  different  approach  is  to  instead  measure  how  much  of  the  unknown 
signal  is  contained  in  each  of  several  basis  signals.  A  list  of  numbers  is  used  to 
characterize  the  shape  of  the  signal(region).  If  shape  recognition  (  detection  or 
classification)  is  desired,  then  instead  of  matching  them  directly,  a 
characterization  is  used  that  is  convenient  for  the  operations  needed  in  lining 
them  up  properly  prior  to  the  matching  process.  Then  it  can  be  determined  if 
they  match  by  simply  measuring  the  distance  between  the  two 
characterizations.  The  characterization  is  obtained  by  projecting  the  signals 
onto  the  basis  signals.  This  is  done  for  each  basis  signal  and  the  list  of  values 
obtained  constitutes  the  characterization  of  the  shape.  It  is  desirable  if  the 
basis  signals  chosen  are  sufficiently  different  in  order  to  extract  the  greatest 
possible  distinctive  or  independent  shape  quantities. 

The  theory  of  functional  approximation  and  linear  vector  spaces 
(KOLM75)  provides  exactly  the  formalism  necessary  to  accomplish  such  a  task. 
The  methods  of  shape  analysis  compared  in  this  report  differ  mainly  in  what 
sort  of  basis  signal  are  used  to  characterize  a  shape.  The  measurement  of 


quantities  used  for  signal  recognition  (detection  or  classification)  and 
description  is  termed  feature  extraction. 

1.3  Categories  of  Shape  Analysis  Methods 

Shape  analysis  methods  can  be  assigned  to  one  of  several  categories.  One 
possible  distinction  can  be  made  on  the  basis  of  whether  local  or  global  context 
is  used  in  extracting  the  shape  information.  With  a  local  shape  analysis 
method  only  a  subset  of  the  elements  in  the  characterization  is  alfected  by 
information  in  a  local  neighborhood  on  the  shape.  However,  with  a  global 
method  almost  every  element  in  the  characterization  is  affected  by  information 
in  a  local  neighborhood. 

Another  partition  can  be  made  based  on  how  much  of  the  shape  must  be 
available  for  the  method  to  still  perform  adequately.  Complete  shape  analysis 
methods  need  the  entire  shape  present  to  perform  well.  Partial  shape  methods 
are  designed  to  perform  well  even  when  some  of  the  shape  is  either  missing  or 
severely  corrupted  by  noise.  All  but  one  of  the  methods  discussed  and 
compared  in  this  report  are  global  methods.  However,  the  performance  of  these 
methods  using  partial  shapes  is  also  included.  In  Chapter  8,  a  new  method  for 
partial  shape  recognition  based  on  the  Fourier-Mellin  transform  is 
presented  [GR  OG83) . 

Shape  methods  can  also  be  categorized  on  the  basis  of  whether  the  interior 
of  the  object  or  only  the  exterior  boundary  of  the  object  is  used  in  forming  the 
characterization[PAVL78j.  An  external  scalar  shape  method  analyzes  the 
essentially  one-dimensional  function  describing  the  boundary  of  the  object. 
Internal  spatial  analysis  methods  operate  on  the  spatial  distribution  of  the 
object’s  interior  elements. 
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The  main  goal  of  this  report  is  to  compare  five  methods  of  complete  shape 
analysis.  Each  method  will  be  discussed  in  detail  in  a  separate  chapter.  The 
methods  will  be  compared  by  first  providing  their  defining  relationships  and 
listing  many  of  the  properties  that  pertain  to  shape  recognition  and  description. 
Also  provided  are  generic  shapes  with  their  corresponding  representations. 
Lastly,  empirical  results  from  several  large  recognition  experiments  are 
presented.  The  combined  results  from  all  five  of  the  shape  methods  are  then 
discussed  in  a  following  chapter.  In  the  chapter  immediately  following  is  a 
discussion  of  the  recognition  system  used  and  experiments  carried  out  to 
obtain  the  empirical  recognition  results. 
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CHAPTER  2 

THE  RECOGNITION  SYSTEM 

2.1  Recognition  System  Overview 

Information  processing  systems  are  often  designed  to  sense  the 
environment.  Signals  are  collected,  information  contained  in  patterns  of  the 
signal  are  extracted,  and  then  an  appropriate  response  is  issued  based  on  that 
information.  A  general  system  for  pattern  recognition  is  depicted  in  Figure  2.1. 
Let's  briefly  discuss  the  operations  that  take  place  as  patterns  are  processed  by 
such  a  system. 

Information  about  the  environment  is  acquired  by  collecting  energy.  This 
collection  is  accomplished  by  a  device  (collector)  such  as  an  antenna,  a  lens, 
microphone,  etc.,  and  is  ordinarily  distributed  in  space  and/or  time.  A 
transducer  converts  the  gathered  energy  into  a  form  suitable  for  processing  by 
the  recognition  system.  The  collector  and  transducer  together  are  often  called 
the  sensor.  The  pattern  is  then  presented  to  the  the  preprocessing  functional 
block.  This  block  usually  contains  filters  which  either  enhance  or  restore  the 
signal.  Some  noise  reduction  is  often  attempted.  The  segmentation  block 
separates  or  labels  portions  of  the  incoming  signal.  This  block  is  often  a  simple 
thresholding  operation,  but  can  be  much  more  sophisticated  such  as  a  pixel 
classifier  or  labeling  of  the  signal  components  as  a  result  of  a  feature  clustering 
algorithm.  The  basic  idea  is  to  discriminate  among  portions  of  the  signal.  The 
segmentation  functional  block  is  often  the  most  dillicult  block  to  design.  This 
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functional  component  is  very  much  like  the  pattern  recognizer  to  follow.  A 
tradeoff  between  the  complexity  of  the  segmentation  and  the  remaining  portion 
of  the  pattern  recognition  system  is  often  the  main  source  of  engineering 
difficulties  in  implementing  a  system  which  can  process  signals  from  a  real 
world  environment.  Next  is  feature  extraction.  This  is  the  functional 
component  where  the  shape  features  are  extracted  when  using  a  shape  analysis 
method.  Comparing  the  different  shape  methods  mainly  means  changing  Ibis 
functional  component  with  another  employing  a  different  shape  method.  This 
component  can  also  be  the  functional  unit  that  provides  the  primitives  needed 
when  using  a  syntactic  method.  Often  following  feature  extraction  is  feature 
reduction.  This  is  employed  to  eliminate  redundancy  in  the  extracted  features 
or  to  simply  prune  down  the  number  of  features  to  a  manageable  size.  The 
methods  often  used  are  linear  transformations  including  the  principal 
components  transformation.  The  feature  recognition  functional  element 
following  can  be  simple  template  matching  using  an  error  norm  followed  by  a 
ranking  of  the  errors  (K  weighted  nearest  neighbors),  a  syntactic 
analyzer(parser),  or  another  decision  function  based  on  a  probabilistic  model. 
The  feature  classifies  often  calls  upon  the  knowledge  base  for  information  such 
as  templates,  productions  rules(grammars),  statistical  parameters,  or  a  model 
with  its  associated  parameters. 


2.2  Recognition  System  Simulation 

Now  the  simulation  of  the  recognition  system  used  to  obtained  the  results 
contained  in  this  report  will  be  discussed.  Shapes  for  investigating  the 
recognition  system's  performance  are  simulated  rather  than  using  actual  sensor 
data  in  order  to  obtain  more  control  over  the  experiments.  Silhouettes  of 
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three-dimensional  objects  are  obtained  using  a  computer  graphics  program. 
Models  of  military  aircraft  are  formed  by  the  union  of  cylindric  and  rhombic- 
prisms  whose  size  and  orientation  have  been  specified.  The  faces  of  these 
prisms  are  specified  by  the  list  of  their  vertices.  The  vertices  are  first  rotated 
in  order  to  obtain  the  silhouette  of  the  object  corresponding  to  viewing  from  a 
particular  aspect  angle,  (<?x  <}>y,  o2).  (See  Figure  2.2.)  Then  the  object, 
0(x,y,z),  is  parallel  projected  (P  operator)  onto  the  (x' ,  y' )  image  plane.  So, 
corresponding  to  a  particular  aspect  angle  (4>x,  <>y,  <£>z)  is  the  image 
I(x',  y'  )  =  f'jofx.y^)  :  (0X,  )j.  To  simulate  the  projection  operation 

an  array  of  individual  parallel  rays  are  '‘fired”  at  the  model.  Where  a  ray 
intersects  the  object  an  image  picture  element  (pixel)  is  set  to  one  and  where 
there  is  no  intersection  the  pixel  is  assigned  a  value  of  zero.  So.  in  this  case  we 
have  a  discrete  image, 

li;  =  I(  iAP.  jA(/)  =  P  (o(x,y,z)  :  ( 0X,  <f)y,  <pz),  A(/J, 


where  A(/  is  the  spacing  between  rays. 

No  attempt  is  made  to  perform  any  noise  removal  or  any  other 
preprocessing.  In  other  words,  the  preprocessing  component  is  not  simulated. 

In  one  set  of  experiments,  noise  is  added  to  the  two  level  image  in  order  to 
simulate  a  real  world  situation.  The  discrete  noise  image  ,  n-,  is  added  to  the 
discrete  projected  silhouette  image  to  obtain  1^  =  I,:  +  n,j.  This  corrupted 
image  is  then  thresholded,  i.e. 


i,  if  tj>' 

0,  if  tj  <  t ' 


where  t  =  threshold. 
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In  order  to  reduce  storage  re(iuirenients,  instead  of  retaining  tlie  object 
silhouette  image,  only  the  boundary  is  stored.  The  boundary  information  is 
stored  in  the  form  of  an  eight  direction  chain  code.  The  chain  code  is  a  list  of 
eight  valued  numbers  that  describe  the  direction  to  take  to  get  from  the 
present  position  to  the  next  as  the  object  is  traced  counterclockwise  The 
chain  code  is  the  ordered  list,  V :  a0a,  •  •  aK  ,,  where  a;  =  the  direction  code 

specifying  the  direction  to  the  next  image  point  along  the  contour,  and  K  = 
length  of  the  chain.  The  directions  of  the  chain  code  are  shown  in  Figure  2.3. 

Each  different  shape  method  generates  a  feature  vector  _f  =  jf0  . . .  fnJ 

which  is  a  characterization  of  either  the  boundary  function  or  the  image  itself. 

Let  f(x)  be  the  function  to  be  approximated,  then  f(x)  =  £)f;p;(x),  and 

i 

fj  =  /f(x)0;(x)  dx,  where  0(x)  are  the  basis  functions  of  the  method  and  {0;(x)} 
is  the  reciprocal  basis  for  {0,(x)}.  For  later  processing,  the  set  {f;}  is  usually 
truncated  to  a  finite  set  of  elements  K,  and  reordered  to  form  the  feature 
vector  £-  |f0...fKlJ  ■  The  coefficients,  f;,  are  used  in  forming  the 
characterization. 

Further  reduction  of  the  feature  vector  is  performed  by  applying  a 
principle  components  transformation  followed  by  a  truncation  of  the  elements 
in  the  feature  vector.  The  principle  components  transformation  is  derived  by 
combining  the  feature  vectors  from  all  the  libraries  and  calculating  the  sample 
covariance  matrix.  That  is 

|  i  i  _ 

'-i, 

N  t=l 


where 
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^ij  —  (i,j)lh  element  of  the  covariance  matrix 

_  j  |NJ' 

f;  =  — -  J]  ^  j  =  mean  value  of  the  itf*  feature. 

N  11=1 

N'  —  number  of  feature  vectors  in  the  combined  libraries. 

f^  i  —  ith  feature  of  the  0th  feature  vector  in  the  combined  library. 

S  is  a  symmetric  matrix.  The  principle  components  transformation  then  is  the 
matrix  of  eigenvectors  of  the  covariance  ordered  by  eigenvalue  from  the  largest 
to  the  smallest,  i.e.  { P ]T[  5^  ] ( P J  =  A,  where  A  is  a  diagonal  matrix  with  the 
eigenvalues  along  the  diagonal  and  P  is  the  matrix  of  eigenvectors.  Letting  Pf 
be  the  matrix  P  with  its  rows  reordered  by  eigenvalue  (largest  to  smallest)  and 
zero  rows  for  the  jth  rows,  j  >  K' ,  then  the  reduced  feature  vector  is  £  =  P'  L 
In  general,  the  reduced  feature  vectors  in  the  experiments  consist  of  twelve  (12) 
elements. 

Each  individual  airplane  shape  feature  library  is  transformed  into  a 
reduced  feature  vector  library.  These  six  libraries  for  the  six  different  military 
aircraft  make  up  the  knowledge  base  or  templates  used  by  the  classifier. 

Feature  vectors  from  unknown  shapes  are  then  classified  using  a  nearest 
neighbor  template  matching.  The  unknown  feature  is  compared  to  every 
feature  vector  in  the  shape  libraries.  The  unknown  shape  is  then  labeled  as 
belonging  to  the  aircraft  containing  the  library  feature  vector,  P,  having  the 
smallest  distance  to  the  unknown  feature  vector,  r.  The  distance  is  measured 
as  the  sum  of  the  square  of  the  difference  between  each  feature  vector  element 
in  the  feature  vector,  i.e. 

4,  =  Eif"i  -  hr- , 

i=  1 

where  fj1'1  -  denotes  the  ilh  element  in  the  jlh  feature  vector  of  the  Llh  library. 
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2.3  Scheme  of  Experiments 

Five  basic  classification  experiments  were  designed  to  test  the  performance 
of  the  shape  analysis  methods  under  varying  conditions.  Following  is  a 
description  of  each  experiment.  Each  of  these  experiments  was  carried  out  for 
each  for  the  five  complete  shape  analysis  methods.  The  results  of  these 
experiments  are  reported  in  each  of  the  chapters  describing  that  particular 
method.  In  chapter  8  is  a  comparison  of  the  results  of  these  experiments  for 
the  differing  shape  methods. 

The  shapes  used  for  data  in  the  experiments  are  two-dimensional 
projections  of  three-dimensional  models  of  six  different  military  aircraft.  They 
are  the  Mig,  B57,  Phantom,  F104,  F105,  and  the  Mirage.  The  shape  libraries 
are  the  sets  of  feature  vectors  derived  from  multiple  views  of  each  of  these  six 
aircraft.  Figures  2.4  to  2.9  show  each  of  the  six  military  aircraft  shape  libraries 
consisting  of  143  (13x11)  views  at  an  image  resolution  of  256x256  pixels.  The 
viewing  angles  (in  degrees)  used  to  generate  the  libraries  are  pairs  of  <f>x  and  <f>y 
where 
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Fifty  (50)  random  views  not  already  contained  in  the  libraries  are 
generated  for  each  of  the  six  aircraft  (6x50=300  unknowns).  These  unknown 
shapes  are  used  as  test  patterns  to  be  classified. 
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Figure  2.4  Mig  aircraft  shape  library  of  143  (13x11)  views,  256x256  image 
resolution. 


j  i  j  j/^w  \  i  i 


^  ^  ^  ^  ^ 


^  ^  V  V 


^  ^  ^5» 

1  ?  $  'v3  “v3  'v5  “v5  'v'  ‘v’  ? 

"N^  *^5^  ^3  *=^=>  ^  ^ 


'^i,  "^5^  S^i 


«<J5  c^  ^*<r  ^s*° 


-Jfc-  -^y  ^s>  Co  Co 

•*5=^  .<£=£*  <5=^  4S^  -A-  -£>  CtSO 

^=^J’  ^  '«&’’  ^5?  ^  ^ 

^  '&-$7  §x  %>.  %>. 

1  D  $  ^  ^  ^  ^  ^  $  tf 

,!\7'v'^  V?  C?  /?* 

c=-^  r^  ■gcl  ^7  -<^  ^  ££3  ^=3 


■cso,  "CO.  -^?  -«5»-  'W  -csr 


■p=»  "W 


Figure  2.6  Phantom  aircraft  shape  library  of  143  (13x11)  views,  256x256 
image  resolution. 
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F104  aircraft  shape  library  of  143  (13x11)  views,  256x256  image 
resolution. 
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Figure  2.8  F105  aircraft  shape  library  of  143  (13x11)  views,  256x256  image 

resolution. 
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Figure  2.9  Mirage  aircraft  shape  library 
image  resolution. 


of  143  (13x11)  views,  256x256 
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2.3.1  Image  Resolution  Experiment 

This  experiment  was  designed  to  test  the  shape  methods  as  noise  was 
introduced  as  a  result  of  varying  image  spatial  resolution.  The  spacing  of  the 
rays  (Al)  varies  the  image  quality  and  thereby  the  quality  of  the  projected 
shape.  Both  libraries  and  the  unknowns  were  generated  with  pixel  arrays 
resulting  in  images  with  16x16  to  512x512  pixels  per  view.  The  same  number 
of  elements  in  the  feature  vector  was  used  throughout  this  experiment.  Each 
aircraft  library  consisted  of  143  views  corresponding  to  an  13x11  array  of 
aspect  angles  covering  a  hemisphere.  Figures  2.10  to  2.15  show  samples  from 
the  unknown  shape  test  sets  at  each  of  the  six  different  image  resolutions. 

2.3.2  Feature  Vector  Experiment 

This  experiment  was  designed  to  provide  insight  into  how  compactly  the 
shape  information  was  represented  by  the  feature  vector.  With  a  fixed  number 
of  views  in  each  of  the  aircraft  libraries(143)  and  with  the  libraries  generated  at 
a  fixed  image  resolution  (256x256),  the  feature  vector  length  was  gradually 
increased  for  both  the  libraries  and  the  unknowns.  The  unknown  image 
resolution  was  also  varied  to  provide  additional  information  on  how  well  the 
feature  vectors  characterize  the  shape.  The  experiment  was  carried  out  using 
the  full  feature  set  (in  its  varying  lengths)  and  also  when  the  feature  extraction 
is  followed  by  the  principle  components  feature  reduction  resulting  in  a  reduced 
length  feature  vector  (12  real  numbers.) 


2.3.3  Library  Projection  Experiment 

This  experiment  was  designed  to  show  variations  in  classification  accuracy 
as  the  number  of  views  in  the  shape  libraries  was  decreased.  This  tested  the 
different  methods  ability  to  properly  classify  a  three-dimensional  object  from  a 
limited  number  of  views.  It  also  tested  how  well  the  method  could  discriminate 
between  very  similar  shapes  that  belong  in  different  aircraft  libraries.  This 
occurs  when  there  are  many  views  from  each  aircraft.  These  are  opposing 
effects  however.  So,  to  obtain  more  definitive  results  about  these  two  effects 
would  require  an  independent  method  of  measuring  the  similarity  among 
shapes.  This  however  was  not  available.  Others  have  investigated  methods  of 
reducing  the  library  storage  requirements  [CHAR81,  GLEN82].  Results  from 
these  investigations  imply  that  only  a  small  number  of  views  are  necessary  to 
represent  the  shape  adequately  for  the  purposes  of  shape  recognition.  Figures 
2.16  and  2.17  show  the  F104  libraries  at  an  image  resolution  of  256x256  with  9 
(3x3)  and  49  (7x7)  views  per  library,  respectively.  The  viewing  angles  (in 
degrees)  used  to  generate  the  9  view  (3x3)  libraries  are  pairs  of  <j>x  and  <j>y  where 

<j>x:  8.0  90.0  171.9 

and 

4>y:  -72.2  0.0  72.2 

The  viewing  angles  (in  degrees)  used  to  generate  the  49  view  (7x7)  libraries  are 
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pairs  of  ox  and  <py  where 

4>x:  2.0  8.0  31.5  90.0  148.4  171.9 

and 

<(>r  -90.0  -53.9  -17.8  0.0  17.8  53.9  90. 


2.3.4  Imaging  Noise  Experiment 


This  experiment  was  designed  to  test  the  performance  of  the  shape 
methods  when  the  shapes  have  been  perturbed  by  imaging  noise.  To  simulate 
this  efTect,  50  unknown  views  of  each  of  the  six  aircraft  were  generated.  The 
images  consisted  of  a  background  level,  gb,  of  96  and  a  level,  g0,  of  160  when 
the  pixel  was  contained  in  the  view  of  the  three-  dimensional  aircraft  two- 
dimensional  projection.  Then  zero  mean,  white,  Gaussian  noise,  n;j,  was  added 
to  the  image,  I;j.  The  image  was  then  thresholded  at  128.  So,  Ijj  =  I;j  +  n^, 
where  Efr^j  -  0,  E(n;j  nkmJ  =  cr  6(i_k)(j_m)  and 

n  _  x2 

Prob[n|j  <  n]  =  /  ■  - e  2a  dx.  The  image  was  then  thresholded  at  128. 

~oo  v  2tt<t 


So, 


i\; 


1,  if  iij  >  128 
[o,  if  <  128. 


The  contours  of  the  resulting  regions  were  traced  and  the  chain  code  for  the 
region  having  the  longest  chain  code  was  retained  to  represent  that  unknown. 
These  unknowns  were  then  classified  and  the  results  tabulated  for  each 
method  The  same  operation  was  carried  out  on  the  six  sets  of  50  unknowns 
for  four  different  noise  levels  corresponding  to  signal-to-noise  ratios  of  3,  6,  10, 


Figure  2.17  F104  library  of  49  (7x7)  views,  256x256  image  resolution. 
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and  20  dB.  The  signal-to-noise  ratio  here  is  defined  as 

SNR  =20  log(  — ), 

<7 

where  A  =  g0  -  gb,  and  a  -  standard  deviation  of  the  white  Gaussian  noise 
process.  (A  =  64  and  a  —  40.3,  32.08  ,  20.02,  and  6.4.)  This  same  procedure 
was  repeated  so  that  noisy  unknown  shapes  were  generated  at  image 
resolutions  of  64x64  and  128x128  image  resolutions.  Figures  2.18  to  2.27  depict 
images  of  an  F104  at  each  of  the  five  signal-to-noise  ratios  and  their 
corresponding  contours.  The  libraries  were  at  an  image  resolution  of  256x256 
and  contained  143  views  per  library. 

2.3.5  Partial  Shape  Experiment 

This  experiment  was  designed  to  test  the  performance  of  the  shape 
methods  in  classifying  partial  shapes.  Partial  shapes  were  generated  by  taking 
the  50  unknowns  for  each  aircraft  and  chopping  from  10%  to  50%  of  the 
contour.  The  chopped  out  portion  was  replaced  with  a  straight  edge,  a  60 
turn,  and  another  straight  edge  to  close  the  contour.  These  unknowns  were 
processed  to  extract  their  shape  feature  vectors.  Each  feature  vector  was 
reduced  and  classified.  This  experiment  was  performed  with  a  constant  feature 
vector  length  and  with  the  libraries  at  a  fixed  image -resolution  (256x256)  and 
number  of  views  in  each  library  (143).  The  unknowns  were  all  at  an  image 
resolution  of  128x128.  Figures  2.28  to  2.33  show  examples  of  partial  shapes  at 
each  of  0%,  10%,  20%  30%,  40%  and  50%  chopped. 


Figure  2.18  Noise  free  F104  silhouette,  128x128  image  resolution. 
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Figure  2.24  F104  image  with  noise  added  (SNR  =  10  dB),  128x128  image 

resolution. 
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CHAPTER  3 

FOURIER  DESCRIPTORS  OF  THE  BOUNDARY 

3.1  Introduction 

The  first  method  to  be  discussed  is  a  based  on  the  well  known  method  of 
Fourier  series  analysis  for  functional  approximation.  Having  been  provided 
with  the  silhouette  of  an  object,  the  contour  or  boundary  completely  specifies 
the  two-dimensional  shape.  The  contour  can  be  parameterized  as  a  function  of 
time  by  tracing  around  the  boundary  in  counterclockwise  direction.  (See 
Figure  3.1.)  As  the  tracing  continues  the  function  begins  to  repeat.  Since  this 
function  is  periodic,  it  can  be  expanded  into  a  Fourier  series.  Each  basis 
function  of  the  complex  exponential  Fourier  series  is  non-zero  almost 
everywhere  over  each  period.  So,  this  is  a  global  method  of  shape  analysis. 

The  boundary  function,  7,  maps  the  real  number  line,lR,  into  the  complex 
plane,  C.  The  projection  on  the  real  axis  is  the  x  component  and  the 
orthogonal  projection  on  the  imaginary  axis  is  the  y  component. 

7:  1R  -  C 

7(t)  =  ( x(t), y(t) )  C  IRxIR 
7(t)  =  x(t)  +  iy(t)  C  C. 

(j  /v/ 

The  velocity  of  the  tracing  is  v(t)  =  ~j~L-  So,  the  speed  of  tracing  is 

ai 

equal  to  the  magnitude  of  the  velocity,  i.e. 


vm  I  2  =  d7(t)  .d,Tf*(t}. 
11 '  dt  dt  ‘ 


If  we  assume  that  the  tracing  takes  place  at  a  constant,  speed  (uniform  tracing ), 
then 


v  = 


=  constant. 


Since  the  speed  is  constant  and  the  period  of  the  trace  is  T,  then  T  =  vL, 
where  L  is  the  total  arc  length  once  around  the  contour.  By  definition,  the  arc 
length  is 

T 

L  =  /  |  7(t)  |  dt  <  oo 
o 


Assuming  7(t)  is  a  continuous,  bounded,  periodic  function  with  finite  arc  length 
(i.e.  rectifiable),  ^(t)  can  be  expanded  in  the  Fourier  series 
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3.2  Properties 

A  great  deal  is  known  about  the  properties  of  the  Fourier  series.  These 
properties  can  be  exploited  to  provide  some  insight  into  the  manner  in  which 
the  Fourier  series  coefficients  represent  shape.  Following  is  a  list  of  many 
properties  of  the  Fourier  coefficients  [GRAN72,RICH74,PERS77].  The 
properties  will  be  used  later  to  facilitate  the  manipulation  of  the  Fourier 
coefficients  for  shape  recognition.  Several  of  the  properties  also  apply  to  the 
shape  description  problem. 


.  n  . 

i  —  t 

1)  Completeness:  The  Fourier  basis  functions  e  T  form  an  orthogonal 
basis,  complete  in  C[a,  b]  ( or  L2|a,  b] ). 

2)  Convergence:  Since  the  contour  function  7  is  a  bounded  rectifiable  Jordan 
simply  connected  closed  contour  (SCO),  the  Fourier  series  converges. 

3)  Continuity.  Since  7(t)  is  everywhere  continuous  (SCC)  and  its  first  is 
continuous  almost  everywhere,  the  modulus  of  the  coefficients  are  of  order 
1/n2,  i.e.  |  cn|  =  o(l/n2). 

4)  Translation:  If  an  object  is  translated  m  the  plane  by  Z0  =  x0  +  iy0,  then 
Y  =  7(t)  +  Zo  which  implies  that 

c'  0  =  c0  4-  Z0 
c'n  =  cn>  n*0. 


(See  Figure  3.2.) 
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5)  Rotation:  Rotation  of  the  shape  through  an  angle  a  about  the  origin 
implies  that  c' the  coefficients  of  the  rotated  object  have  a  constant 
phase  added.  That  is 

Y(t)  =  7(t)e'"  =*  c'n  =  cne'",  \/n. 

(See  Figure  3.3.) 

Proof: 

x'  —  x  cosa  -  y  sina 
y'  =  xsina  +  ycosa 


So, 


V  =  x'  +  iy'  =  (x  +  iy  )e" 


=  'ye1" 


Hence, 


T  ■  2irn  T  .  2  n  n  t 

c'n  =  i’/'/(t)e  T  dt  =  ~/'y(t)el®e  T  dt 
1  o  1  o 


v  n  c  • 


n 


6)  Dilation  (scale):  If  the  object  7  is  dilated  by  a  factor  X,  then 

V(t)  =  X7(t)  =♦  c'n  =  Xcn,  Y/n. 


(See  Figure  3.4.) 


7)  Starting  point  shift:  Shifting  the  starting  point  counterclockwise  along  the 
contour,  corresponds  to  beginning  at  the  point  in  time,  “t0”,  on  the 
original  trace.  (See  Figure  3.5.)  So, 

j  2>rn 

V (t)  =  +t0)  -►  c'n  =  cne  T  ,\fa. 


Proof: 


c*  — 

v  n 


2 irn 

10)  e  T  dt. 

1  o 


Let  u  =  t  +  t 


o> 


to  +  T  .  2  ?rn  /  .  > 

,  _  1  r  ,  x  -i—  (u-to)  _ 

=  n  “  7JT  /  1f(u)e  du  - 


1^:T  i^ia.  to 

—  /  7(u)e  T  due  T 

1  to 


.  2irn 
i  "  _  to 

c'n  =  C„e  ,  Vn. 


n 


8)  N-fold  Rotational  Symmetry:  If  a  shape  exhibits  N-foId  rotational 
symmetry,  then  rotating  the  object  by  an  integral  multiple  of  the  angle 

2  Jr 

a  =  about  its  center,  c0,  and  moving  the  starting  point  clockwise  by 
kT 

-j^-,  we  have  the  same  trace.  (See  Figure  3.6.)  Hence, 


7(t)-o0  =  to(t- 


kT 

N 


)-c0) 


This  implies  that 


Figure  3.6  N-fold  symmetry  (N  =8). 
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Figure  3.7  Axial  symmetry. 


(1-e  N  )cn  =0,  Vh,  n*0,  k  =0,1,..., N-l . 

which  implies  that 

cn  A  0,  only  if  (n  - 1)  mod  N  =  0. 

and 

cn  =  0,  otherwise,  except  c0  can  be  anything. 


9)  Axial  Symmetry:  For  a  shape  which  is  symmetric  about  its  center  (Figure 

T 

3.7),  c0,  then  ^t)  -  c0  =  -  ^t  +  — - )-c0  .  Which  implies  that 
cn  *  0,  only  if  n  is  odd 
cn  =  0,  otherwise,  except 
c0  =  anything. 

Proof: 

X 

cn  =  ’^r/'/(t)e  T  dt  +  4r/i<t)e  T  dt. 
i  o  1  X 

2 


T 

Letting  X  =  t-  — , 
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10)  Bilateral  Symmetry:  If  7  is  symmetric  about  the  real  axis  (see  Figure  3.8), 
then  for  some  t0  £  [0,  T],  7(t)  =  7(t  + 10)  =  7*(~t)  implies  that  cn  are 
purely  real. 

Proof: 

Since  7  is  continuous  and  symmetric  about  the  real  axis,  then  it  must 
cross  the  real  axis.  Let  t0  be  the  shift  necessary  so  that 

7(t)  =  7*(-t)  =  7(t  +  t0). 

T  _•  2irn  T  _j  2irn  t 

cn  =  7fr/^(t)  e  T  dt  =  TfjYHJe  T  dt. 

1  0  1  0 


11)  Reflections:  Let  V  be  an  object  which  is  the  reflection  of  7  about  the  real 
(x)  axis  (see  Figure  3.9.)  Then 

V(t)  =  'f(-t)  =►  c'n  =  c*n 

Proof: 

T  _J  2irn  t  T  _.  2*n 

c'n  =  *4r/V  (<-)  e  T  dt  =  — r/Tr*(  -t)  e  T  dt. 

1  0  1  0 

-T  j  2nn  y  0  .  2>rn  ^ 

=  -^r/7*(X)e  T  dx  =  ^r/7*(X)e‘  T  dX 
1  0  1  -T 


n 

12)  Area:  For  a  simply  connected  closed  curve  7,  bounding  the  region,  R, 

Area  =  //dx  dy. 

R 


Using  Green’s  Theorem, 
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13)  Line  shape:  A  line  shape  (see  Figure  3.10)  is  one  with  no  interior  such  that 

T  T 

after  some  time - a(0<a< — ),  the  contour  begins  to  exactly  retrace 

2  2 

but  going  in  the  opposite  direction.  That  is 

+  (“«))  =  T(“t  +  y). 


which  implies  that 

.  nirof 

_  1  T 

c-n  -  cne  . 

And  if  the  starting  point  is  an  endpoint  of  the  line  shape,  a  =  0.  In  that 
case 

c-n  =  V 

Note  that  a  line  shape  can  have  only  support  at  most  2-fold  symmetry. 
Also  since  |  c_n|  =  |  cn| ,  then 

OO 

Area  =  X>[|cn|2-  |c-„|2l  =  0. 

n  =  l 


Proof: 

Given  7(t)  =  ^(T-a-t), 


X_  «. 

2  2  •  2  ?r  n 


T-A 


=  v  /  i<‘)' 


i  _  t 

T 


a 

2 


-  ■  2>rn 

dt  +  y  /  7(t)e  T  dt 

1  X_  «. 

2  2 


T  a 


T-f 


2  2  ■  2  n  t 

=  4r  /  T(t)e  T  dt  +  ~  /  MT-a-t). 


-ilin  t 
(  T  dt 


a_ 

2 


X_ 

2  2 


Let  X  =  T-o -t, 


»  -  *  -  — * _ 
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_T  a.  X_*. 

2  2  •  2im  2  2  ■  2irnX  .  2irn 

cn  =  ir  /  T  dt  +  ~  /  ^X)e  T  e  T  dX 

1  —  1  _  o_ 

2  2 

X 

2  .  2irn  ,  .  2rn  ■  2imo 

=  Y  /  T(t)(e  T  +  e  T  e  T  {dt 

_  a 
2 

T__°. 

■  nffa  2  2 

2  1  t  t  27m  ,,  a  v 

-  Ye  J  1<t)cos-Y'(t"  Y^dt‘ 


So, 


c 


n 


.  rnrof 


n 


14)  Error  bound:  An  error  bound  for  the  truncated  Fourier  series  has  been 

(x(t)l 


derived  by  Giardina  (GIAR77).  If  7  is  written  as  7  = 
€>0  there  exists  a  K  such  that  if  n>K  then 

(xnWl 


y(t) 


,  then  given 


Il7^7n11  <  e  >  where  7n  = 


y„M 


where 

K  =  max[V0T(x(t)),  V0T(y(t))|  . 

V0T(x(t))  and  V0T(y(t))  are  the  total  variation  of  x(t)  and  y(t)  respectively. 
The  Fourier  series  of  x(t)  and  y(t)  are  truncated  to  n  terms  to  form  xn(t) 


and  yn(t).  The  norm  is  defined  as 

_  f  sup  |x_xn|  sup  |y-yn|  , 
lhf-7„ll  -max  [0<t^T  >  0<t<T  1  ' 


3.3  Calculation 

Since  the  images  are  discrete,  connecting  the  contours  of  the  pixels  around 
the  boundary  produces  a  polygon.  The  Fourier  coefficients  can  be  computed 
directly  from  the  increments  in  time  (arc  length)  and  position  as  the  polygon  is 
traced.  (See  Figure  3.11.)  This  is  called  the  direct  Fourier  transform  (DFT). 
The  following  derivation  follows  along  the  lines  of  [KUHL82],  except  here  the 
complex  exponential  series  is  used.  Also,  a  much  more  simple  expression  for 
the  DC  term  is  obtained. 

Let  the  increment  in  position  be  the  complex  number  A7i,  and  the 
increment  in  time  will  be  At;  =  |  A^j  =  y^Ax2  +  Ay;2.  Since  7  is 
continuous,  its  derivative,  7,  exists  almost  everywhere.  So,  let 

.  .  2irn  . 

•  +00  1  „  t 

7  =  E  e 


be  the  Fourier  series  for  7.  But  the  series  for  7  can  be  differentiated  term  by 
term  so  that 


E 
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.  2 jt n  1/  a  2irn  . 


*  =  */*>•'  T  ,  =  1^,.  ■  at 

A  0  p-1  P  tri 


a  .  2*rn  A  .  2jrn  .  _ 

T  P=i  AtP 


We  can  then  write 


c„  - 


T  &  ^ p 


n  4*rV  pt;  Atp 


-  2 »rn  ^  .  2irn 

~=T-  l*  -=T-  Vi 
e  1  -  e  1 


where 


tp  —  ^  Atj ,  t0  —  0, 
i=l 

K  is  the  number  of  sides  on  the  polygon,  and  T  =  tK  = 
term, 

T  tp 

Co  =  i/lMdt  =  tJt  E  /l<t>dt 

1  0  1  p=ltri 


=  V  Etl/2A-,p(At„  +  2tp„)  +  fpAtp], 


rp  ZJ 

1  p=i 


^2 E 

At„ 


where  -  7p-i  ^t'tp-i»  7p  _7p-i  +  A7p,  ft  7 


period.  For  the  c0 


=  starting  point. 


Hence, 


co  =  -Tf  £  t1/2^  +  Vl)Atp. 

1  P=1 

For  the  discrete  images,  the  contours  are  represented  by  the  eight- 
direction  chain  code.  So,  the  A^  and  At;  can  be  determined  using  a  lookup 
table  indexed  by  the  chain  code.  . 

a;:  0  1  2  3  4  5  6  7 

Ay  1  1+i  i  -1+i  -1  -1-i  -i  1-i 

At:  1  \/2  1  y/2  1  v/2  1  v/2 

The  computational  complexity  is  of  order  N-K,  where  N  is  the  number  of 
coefficients  calculated  and  K  is  the  number  of  links  in  the  chain  coded  contour. 
Using  the  chain  codes  can  be  inefficient,  however,  when  there  are  runs  of 
identical  chain  codes.  By  requiring  additional  storage  for  two  vectors  (one 
complex  and  one  real),  the  computation  necessary  for  the  coefficients  can  be 
reduced.  It  is  necessary  to  compute  the  vectors  {At' ;}  and  (AYi)  for 
i  =  0, 1,  •  •  •  ,K'  K'  <  K,  where  AY ;  is  the  sum  of  the  Ay}  for  the  ith  run  and 
At' ;  is  the  sum  of  the  Atj  for  the  ith  run. 

An  alternative  to  calculating  a  direct  transform  of  the  polygon  is  to  use 
the  Fast  Fourier  Transform  (FFT),  whose  computational  complexity  is  of  order 
Nlog2N,  N  =  2m.  The  FFT  can  be  used  if  the  discrete  (complex)  values 
representing  the  contour  are  resampled  so  that  there  are  N  =  2m  discrete 
values.  If  the  resampling  can  be  done  efficiently  and  with  minimal  distortion  of 
the  signal  (shape),  then  the  FFT  can  be  used  to  compute  N  =  2m  coefficients 
quickly. 


However,  there  is  distortion  introduced  by  the  resampling  and  also  by 
assuming  that  the  signal  is  now  a  periodic  impulse  train  instead  of  a  continuous 
complex  valued  function.  If  the  contour  is  resampled  to  N  =  2m>>K,  then 
the  distortion  is  minimal,  but  now  the  sequence  of  data  values  is  much  longer 
incurring  a  greater  computational  burden.  The  resampling  operation  also  must 
be  included  in  determining  the  computational  complexity  of  this  approach. 

Since  the  contours  used  in  the  experiments  are  about  200  to  2000  chain 
code  links  and  32  complex  Fourier  coefficients  are  usually  calculated,  the 
resampling  followed  by  the  FFT  was  used.  The  resampling  method  used  linear 
interpolation  between  the  complex  data  values.  If  the  sequence  7  above  is  the 
input  to  the  resampling,  and  71  is  the  sequence  resampled  to  N  =  2m  >  K 
values,  then  the  nth  coefficient  using  the  FFT  is 
N-l-  -^fnj 

cn  =  ,  n  =  0, 1,  •  •  , N-l 

j=o 

and  where 


N-i-  -i^(N--n)j 


Ln  =  E^e  N 
j=o 


PM- 

E7je  =  cN_n. 


So,  the  FFT  gives  the  (-N/2  +  l)th  to  (N/2)th  coefficient. 

As  previously  mentioned,  this  assumes  that 

00  ,  jrp 

7(t)  =  E  7j  $(t-  ±7 ),  where  y  +  Nk  =  7j,  Vk. 

j  =~00 

Also  notice  that  the  |  7j| ’s  are  not  necessarily  equal.  So,  while  the  DFT  gives 
the  exact  results  for  a  continuous  uniformly  traced  polygon,  the  FFT 
implementation  is  based  on  a  sequence  which  is  not  uniform  sampling  along  the 
contour.  Sometimes  the  sequence  7p  »s  very  much  over  sampled  in  order  to 


cause  to  be  approximately  uniform  sampling.  But  if  a  large  amount  of  over 
sampling  is  performed,  then  the  computational  advantage  of  the  FFT  is  lost. 

One  advantage  of  the  DFT  method  is  that  only  the  coefficients  actually 
desired  need  be  calculated.  On  the  other  hand  with  the  FFT  method,  you 
obtain  2m  coefficients  whether  or  not  that  many  are  actually  desired. 

3.4  Normalization 

It  is  usually  desirable  to  compare  shapes  independent  of  size,  orientation, 
and  starting  point.  To  compare  two  shape  boundaries  using  their  Fourier 
descriptors,  it  is  necessary  to  scale,  rotate,  and  shift  their  shapes  in  order  to 
allow  the  “best”  fit  possible.  This  operation  normalizes  the  Fourier  descriptors 
for  the  unknown  shape  to  an  optimum  orientation.  It  has  been  shown 
[PERS77]  that  the  optimum  scale,  rotation  angle,  and  relative  starting  point 
shift  can  be  obtained  to  minimize  the  mean-squared  error  as  the  criterion. 

Given  two  shapes  7  and  Y  which  have  coefficients  cn  and  c'  n,  respectively, 
minimize  the  distance  function 


where  s,  <f> ,  and  a  are  the  scale,  relative  starting  point,  and  rotation, 
respectively.  In  practice,  only  N  coefficients  are  used  to  represent  the  shape.  In 
that  case,  the  sum  is  taken  over  the  finite  set  of  coefficients.  So,  it  is  necessary 
to  minimize  the  following: 

£  !  c„  -  se'l"*  +“Vn|  2 

n=  N/2,  n  /0 

Next  the  partial  derivatives  are  taken  with  respect  to  s,  <f>,  and  a  and  they  are 


set  to  zero.  Then  letting  cn*c' n  —  pne'^%,  the  following  three  coupled  equations 
are  obtained  : 

5>ncos(^n  +  n<£  +  a) 


s 


c'  c!  * 
^  n  ^  n 


(i) 


tan(a)  = 


Elnsin(^n  +  M) 
5>nc°s(^n  +  n<(>) 


tan(o)  =  - 


X)npnsin(0n  +  n^) 
Y^npncos(ipn  +  n  <j>) 


Combining  (2)  and  (3)  into  one  equation  in  <j> 

f(<P)  =  E/’n^V’n  +  n^)  E  0  /,nCOS(V’n  +  n0) 

~  EPncos('/’n  +  n^)En^nsin(^„  +  <j>) 


(2) 

(3) 


The  optimum  value  for  <f>  is  obtained  by  finding  the  roots  of  f (<j>).  Then  the 
corresponding  values  of  s  and  a  can  be  obtained  from  equations  (1)  and  (2). 
Since  usually  only  the  -N/2th  to  the  N/2th  components  are  retained,  the  roots 
of  f(<^)  can  be  obtained.  However,  this  is  very  expensive  to  compute,  especially 
when  the  unknown  has  to  be  compared  to  many  templates. 

Instead  of  finding  the  zeroes  of  the  transcendental  equations  above,  several 
investigators  [PROF82,RICH74]  have  used  simple  correlation  methods  to 
determine  a  good  starting  point  and  rotation  normalization.  The  correlation 
between  two  contours  is  usually  implemented  by  multiplying  the  Fourier 
transforms,  taking  the  inverse  transform,  and  then  finding  the  peak. 


Still  another  (GRAN72]  has  used  algebraic  combinations  of  the  Fourier 
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coefficients  to  obtain  starting  point  and  rotation  invariance. 

To  reduce  the  computational  burden  to  normalize  the  coefficients,  many 
others  have  suggested  that  one  of  several  suboptimum  methods  be  used. 
Instead  of  normalizing  the  coefficients  of  the  unknown  differently  for  each 
template,  the  coefficients  are  normalized  to  a  “standard”  orientation 
independent  of  the  template. 

First  the  object  is  translated  to  the  origin  by  setting  c0  =  0.  Because  the 
fundamental  or  Cj  has  always  been  observed  to  be  the  largest  (n^O),  all  the 
coefficients  are  scaled  by  dividing  by  |  Cj| .  Next,  the  shape  is  rotated  and  the 
starting  point  shifted.  Most  of  the  variations  among  the  suboptimal  methods 
described  in  the  literature  occur  in  how  this  rotation  and  starting  point  shift  is 
obtained. 

The  simplest  method  is  to  rotate  and  shift  so  that  the  transformed 
coefficients  c' ,  and  c'_,  have  zero  phase  [CRIM82,KUHL82j.  So,  the 
normalized  coefficients  c1  „  are  as  follows: 


c'„  = 


_n —  i(nto  +  q,)  ,  0 

cjl  ’  c  0  u’ 


c' 1  —  |c',|  e1^1  and  c'_,  =  |c'_,| 


where 


This  requires  that  to  and  a  to  be 

..  _  -Z.c_,  _  Lcx  +  Z.c_! 

to  - - 2 -  and  *  “ - 2 - * 

This  is  equivalent  to  rotating  and  shifting  the  starting  point  so  the  the  c,  and 
C-|  components  describing  a  fundamental  ellipse  has  its  major  axis  along  the 
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x-axis. 


There  are  two  problems  with  this  approach.  The  magnitude  of  c_j  may  be 
zero.  So,  the  phase  angle  is  indeterminate.  Second,  there  are  two  possible 
orientations  which  satisfy  the  zero  phase  condition.  An  additional  180  ° 
rotation  and  -T/2  starting  point  shift  would  also  satisfy  the  zero  phase 
condition. 

Alternatives  to  this  approach  can  help  to  mitigate  these  difficulties 
[WALL80a,WALL80b,MITC82a,MITC82b].  The  following  algorithm  is  used  in 
this  report  to  normalize  the  shape  coefficients: 

1)  Set  c0'  =  0. 

2)  Scale  so  that  |  c,'  |  =1. 

3)  Find  the  coefficient  ck  that  is  next  largest  (|k-l|  <5,  k  *0, 1.)  If 
]  k  -  lj  >5,  then  let  k  =  2. 

4)  Rotate  and  shift  the  starting  point  so  that  Ley  —  0  and  Lck'  =  0,  i.e. 

.  l  =  Cn  i(nto  +  a) 

V'n  |  .  ,  ’ 

I  C»| 

.  _  (kZ.Cl-Z.ck) 

»hefe  t.  -  « - - 

The  object  ^(t)  =  c,  e  r  +  ck  e  T  has  j  k  —  1 1  -fold  symmetry.  So, 

2j|* 

there  are  1  k  -  1 1  rotations  and  relative  starting  point  shifts,  multiples  of  - — - 


that  will  satisfy  the  zero  phase  condition.  So,  rotate  and  shift  the  starting 
point  so  that 
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•  2<rm  _  ■  2im 

8.=c/e“k-«",‘-«,  m=0,l,  ,(|  k~l|  -2). 

to  maximize  the  following  criterion: 

1  £Re{cn}  |Re{6n}|. 

n 

This  criterion  effectively  chooses  the  normalization  that  orients  the  contour  so 
that  the  axis  of  one  of  the  main  lobes  of  the  |  k  —  1 1  -fold  fundamental  shape 
^e11  +  ckelkt  is  along  the  positive  x-axis  and  the  starting  point  on  the  contour 
corresponding  to  that  lobe  is  the  one  farthest  from  the  origin.  In  order  to 
reduce  the  number  of  rotations  to  perform  and  calculate  the  criterion  only 
|  k  -  1 1  <  5  is  allowed.  If  |  k  —  1  j  >5,  then  k  =  -1  as  in  the  first 
normalization  method. 

The  correct  recognition  of  a  shape  is  very  sensitive  to  this  rotation  and 
starting  point  normalization.  In  order  to  improve  the  classification  accuracy, 
multiple  sets  of  Fourier  coefficients  are  used  in  classifying  the  shape.  The 
descriptor  for  the  best  normalization  above  is  used.  In  addition,  if  one  of  the 
multiple  rotations  has  a  value  of  its  criterion  that  is  within  95%  of  the  best, 
this  normalization  is  also  used.  To  reduce  the  possibilities  that  the  wrong 
coefficient  is  used  in  the  normalization,  the  third  largest  coefficient  is  also  used 
if  it’s  magnitude  is  within  95%  of  the  second  largest  coefficient.  Again,  the 
next  best  normalization  based  on  this  third  largest  coefficient  is  also  used  if  the 
criterion  is  95%  of  the  best  normalization  based  on  this  new  coefficient.  So  in 
all  there  can  be  as  many  as  four  sets  of  coefficients  for  each  unknown  shape. 
For  the  library  features  only  the  best  normalization  is  used.  One  of  the  four 
possible  normalizations  is  likely  to  match  the  proper  template  even  when  some 
noise  is  present. 
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3.5  Generic  Shapes 

In  order  to  recognize  a  generic  shape  (one  of  a  class  of  shapes),  it  is  useful 
to  know  the  analytic  expression  for  the  characterization.  In  this  section  are  the 
analytic  formulae  for  several  basic  shapes.  Each  generic  shape  is  represented 
by  a  set  of  parameters  and  the  formula  for  the  Fourier  coefficients.  In  order  to 
recognize  a  shape,  these  parameters  are  estimated.  The  exact  coefficients  can 
be  calculated.  Then  a  comparison  of  the  coefficients  of  the  unknown  and  the 
model  estimate  can  be  made  in  order  to  make  a  decision  (MITC81). 

Other  properties  of  the  shape  such  as  bilateral  and  axial  symmetry  can 
also  be  investigated.  For  instance,  the  next  largest  coefficient,  |  ck| ,  after  |  c,| 
can  be  used  to  determine  the  fundamental  shape  of  the  object. 

If  |  ck|  is  the  next  largest,  the  shape  has  |  k  —  1 1  -fold  symmetry.  Figure 
3.12  displays  a  library  of  fundamental  shapes  indexed  by  the  value  of  k  and  the 
ratio  of  the  modulus  of  the  coefficients  {MOEL82J.  The  contours  in  this  figure 
correspond  to  ^(t)  =  c,  e!t  +  ck  elkt,  i.e.  inverting  the  Fourier  descriptor  having 
the  two  components  Cj  and  ck.  The  frequency  ratio  is  l:k  and  the  modulus 


ratio  is  — 
1  <k 


Figure  3.13  shows  four  rows  of  the  fundamental  shape  library 


displaying  equal  time  interval  sample  points.  The  non-uniform  sample  spacing 
which  occurs  when  there  are  only  a  finite  number  of  coefficients  can  be 
observed. 


If  most  of  the  energy  is  in  the  odd  harmonic  components,  then  the  shape 
has  (almost)  axial  symmetry.  The  test 

‘*Ek|*>  £  kk 

n  odd  n  even 


where  0  <  t  <  1  can  determine  if  the  shape  has  (almost)  axial  symmetry. 


MODULUS  RATIOS 
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FREQUENCY  RATIO 
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Figure  3.12  Fundamental  shape  library  (MOEL82). 
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FREQUENCY  RATIO 


Figure  3.13  Fundamental  shape  library  showing  sample  points. 
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The  shape  can  be  tested  for  (almost)  bilateral  symmetry  (with  respect  to 
the  x-axis)  by  determining  if  the  components  are  nearly  all  purely  real  after 
normalization.  If 

t  x£|  Re{c„)| 2  >  S|  tm{c„}| 2  , 

n  n 

then  the  shape  has  (almost)  bilateral  symmetry. 

Following  is  a  collection  of  important  generic  shapes  [LAWR72.YATE47] 
with  the  corresponding  Fourier  descriptors. 

1)  Circle:  The  algebraic  equation  for  a  circle  centered  at  the  origin  is 
x2  +  y2  =  r2.  The  substitutions  x  =  rcost  and  y  =  rsint  also  satisfy  the 
equation.  So, 

T(t)  =  rcost  +  rsint  =  re'1,  t  6  [0,2jt] 

describes  a  circle  of  radius  r.  It  has  only  the  component,  c,  =  r. 
Unknown  shapes  can  be  tested  for  their  “circularity”  by  determining  the 
proportion  of  the  energy  in  the  c1  component  relative  to  the  sum  of  the 
rest.  If 

tx|c,|  >  £  |  cnJ 2  , 

n,n»*0,l 

then  the  shape  is  approximately  circular. 

2)  Ellipse:  The  algebraic  expression  for  an  ellipse  centered  at  the  origin  is 

x2  y2 

— —  +  =  1.  Substituting  x  =  a  cost,  and  y  =  bsint  satisfies  the 

a2  b2 

equation  as  t  — »0  to  2n  tracing  out  an  ellipse.  The  ellipse  has  a  major  axis 
of  length  “a”  along  the  x-axis  and  a  minor  axis  of  length  “b”  (a  >  b.) 
This  gives 


V 
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7(t)  =  acost  +  ibsint  =  l/^a  +  b)elt  +  l/^a-bje"'1. 


&  "t*’  b  ^ 

S°,  Oj  =  — - —  and  c_j  =  — — — .  This  is  why  it  was  stated  earlier  that 

M  m 

normalizing  on  ct  and  c_j  is  the  same  as  placing  the  fundamental  ellipse 
best  fitting  the  shape  in  its  standard  position.  However,  this  shape  is  not 
uniformly  traced  since 


a2sin2t 


+ 


*  constant  . 


The  elongation  of  a  shape  can  be  estimated  by  the  ratio 

i  -•  _  __  I  c-i|  _  a-b 

elongation  —  e  =  -  =  — — —  . 

|  c,|  a  +  b 

If  the  t  is  zero  the  object  is  nearly  circular.  If  t  is  greater  than  0  the  object 
is  elongated.  The  value  of  e  will  always  be  less  than  1  for  a  simple  closed 
contour  that  does  not  cross  (traced  counterclockwise). 

3)  Rectangle:  Shown  in  Figure  3.14  is  a  rectangle  in  standard  position  having 
a  length  of  a  and  a  height  of  b.  It’s  starting  point  is  located  on  the 
positive  x-axis.  The  rectangle  can  be  described  by  the  list  of  {A^},  {At,}, 
70,  and  T  obtained  when  tracing  the  rectangle  counterclockwise. 

{A7;}  =  {  b/2,  -a,  — ib,  a,  ib/2  } 


{At}  =  {  b/2,  a,  b,  a,  b/2  }  T  =  2(a  +  b),  70  =  a/2  +  *°- 

This  data  is  used  to  calculate  the  Fourier  coefficients  from  the  contour 
using  the  DFT  given  the  following  formula 
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ib/2  starting  | 

/  point 

-a/2 

a/2  b 

-ib/2  | 

Figure  3.14  Rectangle. 


Figure  3.15  Isosceles  triangle. 
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2(a  +  b)  .  mr  ,  n xp  ,  .  n xp  * 
?„  =  ■  *- '■  lsid-—  {cos— -f-  +  sin— — M 
n  jrV  2  1  2  2  1 


cn  =  0,  (n~l)mod2  *  0 


c0  =  0 , 


where  p  -  — — — .  Notice  that  the  rectangle  exhibits  the  properties  of  2- 
a  +  b 

fold  and  bilateral  symmetry.  The  coefficients  can  be  normalized  to 

.  D7T  ,  n7rp  .  .  njrp , 

sin —  {cos —  +  sm-'r  } 

2  2  2 


n2  {cos-^-  4-  sin^-} 

mt  mt 


The  nth  component  is  a  function  of  only  n  and  the  parameter  p. 
The  c,  and  c_i  coefficients  can  be  used  to  estimate  p,  as 

>  =  i  . 

X  c,  -  c_. 


A  shape  can  be  tested  whether  or  not  it  is  rectangular  by  computing 
coefficients  of  a  rectangle  based  on  the  the  estimate  p.  Then  decide  based  on 
the  distance  between  the  normalized  unknown  components  and  those  for  the 
rectangular  model  estimate.  If  1/2  <  p  the  rectangle  is  in  standard  position. 
If  p  >  1/2  then  it  is  not  and  the  roles  of  a  and  b  should  be  reversed.  The 
report  by  Mitchell  and  others  [M1TC81]  describes  the  use  of  the  above  generic 
shape  recognition  procedures  to  classify  shape  in  imagery  to  assist  in  aerial 
photo  reconnaissance.  The  above  formula  for  the  rectangle  also  includes  the 
special  case  of  p  —  1/2,  corresponding  to  a  square,  when 
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2(a  +  b)  .  n?r  ,  nirp  ,  .  nxp  ^ 

c_  =  v  ~~  sin —  {cos-—11-  +  sin-1-} 

7r*n“  2  2  2 

cn  =0,  (n  -  l)mod  4/0. 


4)  Isosceles  triangle:  Shown  in  Figure  3.15  is  an  isosceles  triangle  whose  two 
equal  sides  are  of  length  a,  and  base  is  of  length  2b.  The  triangle  is  in 
standard  position,  and  c0  =  0.  The  list  of  {A^},  {At;},  70,  and  T  are 

{A^j}  =  {  -\/a2-b2  +  ib,  -i2b,  \/a2  -  b2  +  ib  } 

{At;}  =  {  a,  2b,  a  },  T  =  2(a  +  b),  7o  =  fv42-b2  +  iO. 

4 


This  gives  the  following  formula  for  the  coefficients: 


a  +  b  1 


r'n2 


{  (a  +  b)sin[ 


7rna 
a  +  b 


+  \/(a  4- b)(a-bj(l  -  cos{  7r^a  ])},  n  #0 

a  +  b 


c0  =  0. 

Notice  that  the  triangle  has  bilateral  symmetry.  For  the  case  where 
2b  =  a,  the  isosceles  triangle  becomes  an  equilateral  triangle  with 

cn  =  0,  (n- l)mod  3/0. 

Observe  that  the  triangle  has  3-fold  symmetry. 
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5)  Line  shape:  Shown  in  Figure  3.16  is  the  simplest  line  shape  having 
{Ayt)  =  {  -2,  2  } 

{AtJ  =  {  2,  2  }  70  =  1  +  iO,  T  =  4. 


The  coefficients  are 
4 


c„  = 


n  *v 


,  (n  -  l)mod2  =  0 


cn  =  0,  otherwise. 

The  normalized  coefficients  are 

cn  =  ~  ,  (n-l)mod2  =  0 
n- 

cn  =  0,  otherwise. 

The  shape  is  in  standard  position  an  th*  starting  point  is  on  the  right 
endpoint  ( a  =  0.) 

6)  Thin  cross:  Figure  3.17  depicts  a  thin  cross.  The  list  of  {Ay}  and  {At;} 
are 

{A7j}  =  {  -1,  i,  -i,  -1,  1,  -i,  i,  1  } 

{At;}  =  {  1,  1,  1,  1,  1,  1  },  T  =  8,  70  =  1  +  iO. 

Notice  that  this  is  not  a  ljne  shape.  It  does  not  retrace  itself  in  the 
opposite  direction  in  one  period.  The  coefficients  for  this  shape  are 

cn  =  {[I -(-l)nJ  -  [cos-S.  +  Sin-J-J  +  2sin-y- 
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+  [cos^-sin^l},(n-l)mod4  *0 
cn  =  0  ,  otherwise. 

The  contour  of  a  object  can  be  reconstructed  from  its  Fourier  coefficients 
by  taking  the  inverse  Fourier  transform.  Figure  3.18  show  an  F104  contour 
reconstructed  from  finite  sets  of  coefficients.  Figure  3.19  shows  the  Fourier 
descriptor  magnitudes  for  the  contour.  The  original  F104  contour  is  shown  in 
Figure  4.2. 


3.8  Experimental  Results 

The  experiments  in  section  2.3  were  carried  out  for  the  Fourier  descriptors 
of  the  boundary.  The  feature  vector  was  formed  by  calculating  and 
normalizing  the  coefficients.  After  normalization,  the  c0  and  c,  components  did 
not  carry  any  shape  information  and  were  dropped  from  the  feature  vector. 
The  feature  vector  was  then  formed  by  listing  the  real  and  imaginary  parts  of 
the  remaining  components  as 

f  =  (Re{c_1},Im{c_1},Re{c2},Im{c2}, 

Re{c-2},  Im{c-2}, ...,  Re{cN/2},  Im{cN/2})T  . 

This  is  the  full  feature  vector.  The  32  coefficients,  c_,8  to  c15,  were  used  for 
most  of  the  experiments.  The  feature  vector  was  then  reduced  by  retaining  the 
12  first  elements  of  the  feature  vector  after  applying  the  principle  components 
transformation.  Now  the  results  of  each  basic  experiment  will  be  presented 
and  discussed. 


V 


Harmonic  number 


Figure  3.19 


Fourier  descriptors  of  the  boundary  (magnitude)  for  F104 
contour. 
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The  imaging  resolution  experiment  was  carried  out.  The  classification 
results  are  presented  in  Table  3.1.  The  median  x  and  y  angle  error  between 
the  correctly  classified  aircraft  and  the  closest  view  in  the  library  are  shown  in 
Table  3.2.  The  classification  accuracy  increases  as  the  resolution  of  the 
unknown  increases  for  a  fixed  library  resolution  until  the  resolution  of  the 
unknown  and  the  library  are  the  same.  Then  it  begins  to  decrease.  It  is 
interesting  that  the  accuracy  decreases  with  increasing  library  resolution  for  a 
fixed  unknown  resolution,  if  the  unknown  is  of  less  resolution.  So,  in  order  to 
obtain  the  best  possible  accuracy,  the  unknown  and  library  resolution  should 
be  as  close  as  possible.  If  the  library  must  be  at  a  fixed  resolution,  then  the 
128x128  library  has  the  highest  average  classification  accuracy  for  these  shapes. 
This  is  contrary  to  the  idea  that  the  highest  library  resolution  must  be  the 
best.  The  median  angle  errors  follow  the  same  trend.  The  y  angle  errors  are  in 
general  smaller  than  those  for  the  x  angle  errors.  This  is  probably  due  to  the 
actual  aircraft  shapes  being  used. 

The  results  for  the  feature  vector  experiment  are  provided  in  Table  3.3. 
The  upper  portion  of  the  table  lists  the  results  using  the  full  feature  vectors. 

i 

The  lower  portion  contains  results  for  when  the  principle  components 
transformation  was  used  and  the  12  features  corresponding  to  the  largest 
eigenvalues  are  retained.  As  the  number  of  features  is  increased,  the 
classification  accuracy  increases.  From  the  sudden  increase  in  classification 
accuracy,  and  the  sharp  decrease  in  median  angle  error  between  using  4  and  12 
features  indicates  that  most  of  the  necessary  information  for  these  shapes  are 
contained  in  the  first  8  F ourier  coefficients. 

Table  3.4  contains  the  results  of  the  imaging  noise  experiment.  The 
feature  vector  consisted  of  32  Fourier  coefficients  reduced  to  12  real  numbers. 


Table  3.1  FDS  image  resolution  experimental  results:  classification 
accuracy  (%). 


Unknown 

Resolution 

16 


16 

32 

Prototype  Resolution 
64  128 

256 

512 

54.33 

59.33 

49.33 

45.33 

43.0 

41.33 

64.0 

86.33 

77.0 

72.0 

70.33 

70.0 

60.33 

88.67 

91.0 

91.33 

87.33 

87.0 

55.67 

81.33 

91.0 

93.67 

92.0 

92.0 

50.0 

79.33 

90.33 

92.67 

92.33 

93.0 

50.33 

78.0 

89.0 

93.0 

92.67 

92.67 
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Table  3.2  FDS  image  resolution  experimental  results:  median  angle  error 
Mx,My  (  °  )• 


Unknown 

Resolution 

16 

Prototype  Resolution 

32  64  128  256 

512 

16 

11,7 

12,8 

10,8 

10,9 

12,9 

10,8 

32 

7,5 

8,5 

7,6 

7,6 

7,5 

64 

10,7 

7,5 

7,4 

6,4 

7,4 

7,4 

128 

11,7 

7,5 

7,4 

4,4 

4,4 

4,4 

256 

11,7 

7,5 

7,4 

4,4 

4,4 

4,4 

512 

14,8 

7,5 

7,5 

6,4 

6,4 

4,4 

Table  3.3  FDS  feature  vector  experimental  results. 


Number  of 

Number  of 

Classification 

Median  Angle 

Fourier 

reduced 

Accuracy 

Error 

Coeffs. 

Features 

(%) 

( ° ) 

r  4 

4 

wgt&mm 

HES Hi 

8 

12 

mKSmm 

16 

28 

32 

60 

64 

124 

■beH 

16 

12 

4,4 

32 

12 

4,4 

64 

12 

4,4 

66 


The  performance  improves  consistently  with  increasing  signal-to-noise  ratio. 

Table  3.5  presents  the  results  of  performing  the  library  sampling 
experiment.  This  data  indicates  that  even  more  views  might  be  necessary  to 
obtain  very  high  classification  accuracy. 

In  Table  3.6  are  the  results  for  classifying  partial  shapes.  The  method 
performs  better  than  might  be  expected  for  a  global  method  up  until  more 
than  10%  of  the  contour  is  chopped.  Then  the  performance  drops  off  quickly. 

3.7  Conclusions 

The  Fourier  descriptors  of  the  boundary  perform  very  well  under  varying 
circumstances.  If  more  sophisticated  classification  procedures  with 
interpolation  were  used  then  the  results  would  even  be  better.  The  Fourier 
series  is  well  understood.  This  greatly  facilitates  their  usefulness  for  shape 
description  and  recognition. 
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Table  3.5  FDS  library  sampling  experimental  results. 


Library 

Views 

Library 

Resolution 

Unknown 

Resolution 

Classification 
Accuracy 
(%)  ' 

Median  Angle 
Error 

A^,A0v  ( ° ) 

143 

256 

64 

mmswnmm 

smsacMmmm 

49 

'  ::  -  ; 

■ 

9 

143 

256 

128 

4,4 

49 

79.67 

9,6 

9 

54.67 

25,13 

143 

256 

256 

92.33 

4,4 

49 

79.67 

9.6 

9 

55.0 

25,13 

08 


Table  3.6  FDS  partial  shape  experimental  results. 


Percent 

Contour 

Chopped 

Classification 

Accuracy 

(%) 

Median  Angle 
Error 

92.0 

4,4 

83.67 

7,5 

58.0 

9,7 

IKS 

30.33 

34,15 

40 

22.0 

57,27 

50 

12.67 

62,33 

CHAPTER  4 

WALSH  POINTS  OF  THE  BOUNDARY 

4.1  Introduction 

The  Walsh  functions  have  often  been  used  as  the  basis  for  functional 
approximation  because  the  piecewise  constant  form  leads  to  efficient 
computation  (GIAR83].  Also,  since  the  basis  functions  are  non-zero  over  the 
entire  interval  of  definition,  this  is  a  complete  global  shape  analysis  method. 
Instead  of  the  transform  coefficients,  this  method  uses  the  basis  functions  to 
formulate  the  computation  of  Walsh  points  for  the  shape  feature  set.  The 
Walsh  points  are  a  collection  of  points,  xk  +  iyk,  k  =  0, 1, 2, ..., 2m-  1,  that 
have  a  property  derived  from  the  fact  that  the  Walsh  functions  were  used  in 
formulating  their  calculation.  The  projections  x(t)  and  y(t)  are  approximated 
to  within  a  prespecified  degree  of  accuracy  by  a  piecewise  constant  function,  a 
truncated  Walsh  series. 

The  Walsh  functions,  Wj(t),  are  products  of  the  Rademacher  functions, 
r;(t).  The  Walsh  functions  for  a  complete  orthogonal  basis  set  complete  among 
square  integrable  functions  on  the  interval  (0,1].  The  Walsh  functions  are 
defined  as 

W0(t)  =  1. 

wn(t)  =  rn,  +  i(t)  r„1+  i . rn>  +  l(t), 


where  n  >1  is  expressed  in  binary  as 
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n  =  2"'  +  2n*  +  •  ■  •  +  2n» 

and  the  integers  nj  are  ordered  such  that 

n,  <  n2  <  •  •  •  <  Dp  . 

The  Rademacher  functions  are 
r0(t)  =  1 


ri(t)  = 


+  1,  t  G 10, 1/2) 

-i,  te(iAi) 


ri(t  +  1)  =  rj(t) 


rk  +  i(t)  =  r1(2k  t) ,  k  —  0, 1, 2,  •  •  • 

In  the  following  only  the  x  projection  of  the  boundary  calculation  will  be 
discussed.  The  same  results  follow  for  the  y  projection. 

The  Walsh  series  for  x(t)  is 

*(t)  =  £«iWi(t/T) 
n=0 

where 

T  l 

«i  =  J*M  Wj(t/T)  dt  =  T/x(Tt)  W.,(t)  dt 
0  o 

T  =  total  arc  length  along  7(t)  =  x(t)  +  iy(t)  . 


Each  a,  is  made  up  of  the  area  under  the  portions  of  x(t)  added  and  subtracted 
together.  In  approximating  x(t),  only  a  finite  number  of  terms  in  the  Walsh 
series  are  retained  -  say  the  first  2m.  If  the  series  is  truncated  to  a  finite 
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number  of  terms,  there  is  an  approximation  error.  It  would  be  expected  that 
as  N  =  2m  increases,  the  closer  x2.(t)  approximates  x(t).  This  error  will  be 
discussed  later. 

Let 

Xo“(t)  =  E  ak  Wk(t/T). 
k=0 

Since  the  function  x2«(t)  (or  y2m(t))  describes  a  piecewise  constant  function, 
there  are  2m  discrete  points,  xk  +  iyk,  that  are  the  sum  of  the  heights  of  the 

Walsh  functions  over  each  interval,  f-^-T,  — — -  T),  k  =0,1,  •  •  •  ,  2m  —  1 , 

2m  2m 

which  approximate  the  x  (or  y)  projection  of  7(t)  =  x(t)  +  iy(t). 

It  was  mentioned  earlier  that  the  Walsh  functions  allow  for  an  efficient 
computation  of  the  function  x2m(t)  (or  y2«(t).)  The  ith  Walsh  function  can  be 
written  as 

Wi(t)  =  X  T'ik  *,_k_  k  +  1  ■(*■)  - 
k  =0  > 2"  ’  2" 

where 

7ik=Wi(t)  =  ±l,  l6|i,  *±I). 

The  matrix  of  7ik,  H  =  (Hikj  =  [7ik],  is  the  Hadamard  matrix  with  2m  rows  and 
columns. 

The  ith  coefficient  of  the  Walsh  series  is 
a,  =T/x(Tt)Wi(t)dt  =T2SI7ik/x(Tt)X  k  k  +  I  dt 

0  k=0  0  1 2-  ’  «m  ) 


<k  ±  III 

2" 

»,  =  2"e\*  /  x(t|  <H 

k=0  kX 

2 ■ 


2“  -  1 

«i  =  E  ^ik  Axk, 
k=0 


where 


A,k  =T  /  x(Tt)dt  =  /  x(t)dt 

_k_  kT 

2“  2" 


A.l  is  the  area  under  x(t)  over 


the  basic  interval  ^  )■  By  letting 

2m  2 


Ax  = 


be  the  Area  vector  and 


l0f2"  -  1 1 


be  the  sequency  vector,  we  can  write 


a,  =  H  A. 


The  Truncated  Walsh  series  is 


x»W  =  "s'ai  W,(t/T) 
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2"  -  1 

£«i 

i=0 


E  Tk  E  k  k  +  1  .(t/T) 


k=0 


2“  -  1 

E 

k=0 


2“  -  1 

E  ai  Tk 

.  i=0 


l  r>»  1  o® 


X|i  I.  i  I  /t/T) 

l 2®  ’  2® 


E  ck  *.Jk_  k  +  1  \l/T)  , 

k  —0  '  o®  ' 


where 


2“-  I 

ck  -  E  ^ 

i  =  0 


The  constants  ck  can  be  found  by  first  forming  the  vector  ?2»  as 


(2) 


*s-  = 


c2*_  ] 


and  noticing  that  as  in  equation  (1)  that  equation  (2)  can  be  written  as 

5?2“  —  HT  Qx 

where  HT  is  the  transpose  of  the  Hadamard  matrix,  H. 

Since  H  is  symmetric  H  =  HT  and 

=  H2  Xx  . 

But  because  the  Walsh  functions  are  orthogonal,  it  follows  that  H2  =  2m. 
Hence, 

To»  =  2m  A.  . 

So,  the  truncated  Walsh  series  gives  a  piecewise  constant  function  whose 


i 


i  it  it m  rm 
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heights  are  proportional  to  the  area  under  the  projection  on  each  basic  interval, 


x2-(t)  -  2m  5]  Akx  X  kr  (k  +  utM- 

k=0  2“  ’  2* 


4.2  Truncation  Error  and  t-Net 

In  the  previous  section  the  expansion  for  the  truncated  Walsh  series  was 
derived.  The  error  in  the  expansion  between  x(t)  and  x2.(t)  using  the  norm 


lx(t)  Xjj-ItJlloo  0<t<l  I  X(t)  x2*‘(t')|  £  om  +  l 


where  M  is  the  maximum  slope  in  the  x  projection.  This  follows  from  the  more 
general  result  [GIAR78),  that  if  x(t)  satisfies  a  Holder  condition  of  order  a, 
0<a<l,  with  constant  M,  that  is 


|x(t+h)-x(t)|  <  M  |  h|  ° ,  M>0 


for  t  and  h  real,  then 


I  1 X (t)  —  X2n(t)ll00  < 


(2°-  l)(2°)m  +  I 


In  particular  if  a  =  1  (which  is  the  case  for  polygonal  curves),  we  have 

M 

llx(t)  -  x^t)!^  <  . 


An  t-net  (c>0)  for  an  arbitrary  bounded,  closed  curve  7  3  (0, 1]  -*  IRxIR 
is  a  collection  of  a  finite  number  of  points  W  3  [0, 1)  -*  RxIR  where 
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L  L 

w  =  U{(xit  yi)1 1}  =  .U{wi}  . 

i=  1  i-i 

The  anion  of  the  closed  balls,  (jx-x;j  ,  |  y  —  y; j }  <  c,  covers  the  sot  of 

points  (x,y)  of  7.  That  is,  every  point  of  7  must  be  within  a  distance  c  from 
some  point  in  W. 

For  a  discrete  image  of  an  object  silhouette,  x(t)  (or  y(t))  is  the  projection 
of  a  Freeman  encoded  boundary.  Hence  x (i)  is  a  piecewise  linear  function  that 
satisfies  Lipchitz  condition  with  M  equal  to  the  maximum  absolute  slope  of  the 
linear  elements.  The  important  fact  is  that  the  Walsh  points  in  the  vector  72» 
form  an  c-net  for  the  curve  7.  Figure  4.1  is  a  sketch  of  an  c-neighborhood.  If 
we  take  any  point  p  on  the  curve  7  and  draw  a  closed  ball  (  a  square  with  sides 
of  length  2c  centered  on  p),  then  since  W  is  an  c-net  for  7  there  must  exist  a 
point  w-,  £  W.  However,  if  W'  is  also  an  c-net  for  the  curve  7,  then  there  must 
also  be  a  point  w'  £  W'  in  the  same  c-neighborhood.  So,  the  farthest 
Euclidean  distance  that  w  can  be  from  w'  is  2\/2e.  This  is  key  to  a  recognition 
procedure,  because  the  Walsh  points  for  a  shape  will  be  located  within  this 
degree  of  accuracy. 

The  curve  7  can  be  approximated  to  a  prescribed  error  c  or  it  can  be 
approximated  by  a  fixed  number  of  points  resulting  in  error  bounded  by  c. 
Given  a  prescribed  error  c,  the.  number  of  Walsh  points  will  be  2m  where 

m  =  max(n,  k) 

where 

n  =  (log2( and  k  -  (log2( y ) -  1 J 


|x(t)-x2.(t)|  <  and  |y(t)  -  y2k(t)|  <  ^7 

M  and  L  are  the  Lipchitz  constants  for  x  and  y  respectively.  Alternatively,  the 
number  of  Walsh  points  can  be  fixed  at  N  =  2m  and  the  accuracy  is 
determined  by 

€  =  max(cx,  «y|,  where 


_  M 

2m  +  i 


and 


_  L 

nm  + 1 


In  the  recognition  experiments,  a  fixed  number  of  Walsh  points  were  used. 


4.3  Normalization 


In  order  to  compare  the  Walsh  points  feature  vector  of  an  unknown  to  a 
library  prototype,  normalization  with  respect  to  translation,  scale,  rotation,  and 
shift  in  starting  point  must  be  accomplished. 


The  traced  contour  from  the  image  grid  is  stored  using  Freeman  chain 
code  links.  When  the  shape  features  are  to  be  extracted,  the  chain  code  is 
converted  to  a  complex  vector  7  =  5?  +  i^.  The  increments  in  the  arc  length 
for  the  piecewise  linear  segments  connecting  the  grid  points  are  calculated. 
The  c0,  C|,  and  c_,  Fourier  coefficients  of  the  boundary  are  calculated  to 
provide  the  global  information  necessary  for  translation,  scale,  and  rotation. 
The  translation  is  -eg.  The  scale  factor  s  -  f  Cj| .  The  rotation  angle  is 


Z.c,  4-  Lc_i 

— - - - .  The  complex  vector  7  = 


is  normalized  to  ,  where 


the  ith  component  is 
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V  i —  hi -c0)  e*\  and  V  =  T/s. 
s 

This  normalization  moves  the  center  of  the  shape  to  the  origin.  It  scales  and 
rotates  the  shape  so  that  the  major  axis  corresponding  to  the  fundamental 
ellipse  is  along  the  x-axis  and  has  a  length  of  2.  The  starting  point  is  moved 
by  finding  the  kth  point,  %,  which  is  closest  to  one  of  the  two  points  where  the 
fundamental  ellipse  crosses  the  x-axis.  An  additional  180  °  rotation  is  applied 
if  necessary  to  place  Vk  *n  the  right  half  plane.  So,  the  normalized  Walsh 
points  form  the  vector 

Vk 

_  V  K-i  _  shift  V 
2  Vo  k 

,Vk-t 

•  TC 

or  2  -  <  •  e  2 .  Then  the  Walsh  points  are  calculated  by  computing  the 

Area  vector  and  multiplying  by  2m.  Figure  4.2  depicts  an  F104  contour. 
Figures  4.3  to  4.6  show  the  Walsh  points  with  their  corresponding  t- 
neighborhoods  after  normalization.  Figures  4.7  to  4.11  show  their 
corresponding  x  and  y  projections. 

4.4  Experimental  Results 

The  feature  vector  used  in  the  experiments  was  formed  by  simply  listing 
the  Fourier  points  real  and  imaginary  parts  in  order,  i.e. 

I  =  (x<»yo.xi.yi»  •  *  • .  xK-i.yK-i)T 


For  most  of  the  experiments  the  full  feature  vector  consisted  of  32  Walsh 


-neighborhood 


X  Projection 


Y  Projection 
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X  Projection 


Y  Projection 

‘“1 

H 


Figure  4.9  Normalized  Walsh  points  projections  (2’!’.  32). 


X  Projection 


Y  Projection 

n 


Figure  4.10  Normalized  Walsh  points  projections  (2m  -  84). 
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points.  In  most  cases  the  feature  is  transformed  using  principle  components 
and  then  truncated  to  12  real  numbers. 

Table  4.1  and  Table  4.2  list  the  results  for  the  image  resolution 
experiment.  The  feature  consisted  of  32  Walsh  points  reduced  to  the  12  real 
numbers.  The  best  results  are  obtained  when  the  prototype  and  unknown 
image  resolutions  match. 

Table  4.3  presents  the  results  of  the  feature  vector  experiment.  The 
results  shown  in  the  top  portion  of  the  table  are  for  the  full  feature  vector.  The 
lower  portion  corresponds  to  when  the  feature  was  reduced  to  12  real  numbers. 
The  performance  improves  with  increasing  number  of  Walsh  points. 

Table  4.4  lists  the  results  of  the  imaging  noise  experiment.  The  feature 
set  is  the  12  reduced  real  numbers.  The  performance  improves  with  increasing 
signal-to-noise  ratio. 

Table  4.5  lists  the  results  of  the  library  sampling  experiment.  The 
performance  improves  with  increasing  numbers  of  library  views. 

The  partial  shape  experiment  results  are  contained  in  Table  4.6.  The 
reduced  vector  was  used  here  also.  The  method  can  obtain  a  correct 
classification  3/4  of  the  time  even  with  10%  of  the  contour  chopped. 

4.5  Conclusions 

The  Walsh  points  perform  well.  The  speed  at  which  the  Walsh  points  can 
be  computed  make  them  a  good  competitor.  The  major  problem  is  with  the 
normalization.  Normalization  is  cumbersome  for  the  Walsh  points.  In  fact, 
normalization  is  actually  carried  out  with  use  of  the  Fourier  descriptors  of  the 
boundary.  The  use  of  interpolation  techniques  would  surely  improve  the 
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Table  4.1  WAL  image  resolution  experimental  results:  classification 
accuracy  (%). 


Unknown 

Resolution 

16 

32 

Prototype  Resolution 
64  128 

256 

512 

16 

45.0 

44.67 

40.0 

38.33 

37.33 

35.33 

32 

52.33 

73.33 

71.33 

62.67 

58.67 

57.0 

64 

40.0 

72.33 

83.0 

80.0 

75.67 

72.0 

128 

46.67 

67.67 

82.0 

86.33 

83.33 

82.67 

256 

44.33 

66.0 

81.33 

85.67 

85.33 

83.67 

512 

44.0 

64.33 

70.0 

84.67 

83.33 

85.0 
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Table  4.3  WAL  feature  vector  experimental  results. 


Number  of 

Number  of 

Classification 

Median  Angle 

Walsh 

reduced 

Accuracy 

Error 

Points 

Features 

(%) 

( ° ) 

16 

r  32  ^ 

88.0 

6,4 

32 

64 

00.67 

6,4 

64 

128 

01.67 

6,4 

32 

83.0 

6,4 

64 

1 

83.33 

6,4 

128 

12 

84.0 

6,4 
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Table  4.4  WAL  imaging  noise  experimental  results. 


Library 

Resolution 

Unknown 

Resolution 

SNR 

m 

Classification 

Accuracy 

(%) 

Median  Angle 
Error 

( ° ) 

^  256 

64 

3 

28.0 

25,17  H 

6 

43.33 

14,9 

10 

61.67 

9,6 

20 

oo 

71.33 

75.67 

H 

256 

128 

3 

39.0 

21,12 

6 

52.0 

12,8 

10 

72.0 

7,5 

20 

83.33 

6,4 

oo 

83.33 

6,4 

Table  4.6  WAL  partial  shape  experimental  results. 


Percent 

Contour 

Chopped 

Classification 

Accuracy 

(%) 

Median  Angle 
Error 

A*„A*V  ( ° ) 

0 

r"  83.33 

6,4 

span 

76.0 

7,5 

KB 

56.33 

12,7 

30.0 

27,12 

40 

21.33 

63,27 

50 

17.0 

91,33 

40 

50 
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results.  The  Walsh  points  however  do  not  provide  the  easy  access  to  generic 
shape  property  measurements,  such  as  for  symmetry. 


I' 
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CHAPTER  5 

CUMULATIVE  ANGULAR  DEVIANT 
FOURIER  DESCRIPTOR 


5.1  Introduction 

This  shape  analysis  method  also  uses  the  Fourier  expansion  in  order  to 
exploit  its  time  shift  properties.  Instead  of  approximating  the  boundary 
function,  the  approach  here  is  to  use  a  function  derived  from  the  angle  function 
(ZAHN72).  Physiological  and  psychological  investigations  of  the  human  visual 
system  suggest  that  curvature  is  very  important  in  classifying  shape  [ATTN66], 


The  boundary  is  traced  clockwise  as  a  function  of  arc  length.  At  each 
point  the  instantaneous  direction  of  the  velocity  vector  with  respect  to  an 
orthogonal  coordinate  system  defines  the  angle  function,  0(0),  as  a  function  of 
arc  length.  Since  this  function  repeats  after  one  complete  circuit,  it  is  periodic 
and  can  be  expanded  into  a  complex  Fourier  series.  But  first  it  would  be 
advantageous  to  normalize  this  function.  The  initial  starting  angle  0(0)  =  £0  is 
first  subtracted  to  define 

m  =  ,  0€(o,lj, 

where  L  is  the  total  arc  length.  The  time  scale  is  normalized  to  make  a 
function  that  has  a  period  of  2tt.  Also,  if  the  contour  is  simple  and  closed,  then 
^(0)  =0  and  ^(L)  = -2 ir.  This  linearly  decreasing  angle  is  subtracted  to 
obtain  the  cumulative  angular  deviant  function  ^B(t), 


_ 
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+  1  >  telO,27r]  . 

The  complex  Fourier  series  then  is 

*■(»)  =  S’  d.e1"1 

n=-oo 


where 

2ir 

■ 

2jt  0 

But  since  is  purely  real,  then  dn  =  d_„*.  So,  it  is  only  necessary  to  know  dn 
for  n  >  0.  The  series  can  therefore  also  be  written  as 

<>■(<-)  =  I* o  +  £ Anoint  +  on)  , 

n=l 

where 

P  o  =  d0 

K  =  2|  dn|  and  on  =  Z.dn  . 

5.2  Properties 

The  following  is  a  list  of  properties  of  the  Fourier  series  of  the  cumulative 
angular  deviant  [ZAHN72]  as  they  pertain  to  shape  geometry. 

1)  <t>m  is  invariant  under  translation,  rotation,  and  dilation  (scale)  of  the 
contour  % 


ft 


t 


V 


2)  For  all  simple  closed  curves 

<t>m( 2*)  =  ^"(0)  . 

3)  Since  it  has  been  required  that  ^*(0)  =  0,  then 

00 

d0  =  -2  £Re{dn} 

n  =  l 

or 

00 

00  =  -'EAnC0SK)  • 
n  =  ! 

4)  Reconstruction:  Given  a  contour  described  by  0(0)  and  starting  point  7(0), 
then 

t 

7(0)  =  7(0)  +  /e5#(x>d\  . 

0 


5)  Any  function  ff  (0)  =  0(0 )  mod  2jt  describes  the  same  contour. 

6)  A  curve  is  completely  represented  by  7(0),  60,  L,  and  ^"(t). 

7)  Closure:  The  function  q(0)  is  a  closed  curve,  if  one  of  the  following  is  true: 

a)  A!  is  a  zero  of  the  first  order  Bessel  function  function  Jj(x)  and 
An  =  0  for  n  >  0. 

b)  An  =  0  for  all  n  mod  k  *  0,  where  k  >  2. 

8)  Starting  point:  A  clockwise  shift  in  the  starting  point  of  A1  implies  that 

_i  ZffnAI 

=  V  L 
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K  k  k  + 1 

*(f)  =  EA<^  *  EAi<»< 

i=l  i=l  i=l 


=  0,  for  0  <P  <f, 


With  these  definitions  (see  Figure  5.1),  the  Fourier  coefficients  are 

do  =  ~n~  y'Eh&h 

L  k=i 


•  k  jlfisk 

d„  =  —SA4e  L  ,  n^O, 


“»k=i 


where 

h  =  EA^  • 

i=l 

The  parameters  A0;  and  A<t>\  can  be  obtained  directly  from  the  chain 
codes,  a,,  where  Af;  =  v2  or  1  and 

A^i  =  f(ai  -  a;-,) 


where 


{  (i,f(i))  :  (-7,|),(-«,i),(-5,^),(-4,I),(-3.-|LU-2.-f ), 


(-1,-7),  (0.0),  (1,7),  (2.|),  (3, -y),  (4,-t),  (5, -y),  (0,-7),  (7,-*)}  . 


For  a  contour  having  a  long  chain  code,  this  can  be  a  lengthy  calculation 
of  order  N'K.  This  burden  can  be  reduced  by  resampling  Afj  and  A <f>t  to 
K'  <  K.  This  will  introduce  distortion  which  is  non-linear  with  respect  to  the 


contour. 


Another  alternative  is  to  resample  the  (x-„  y-,)  sequence  to  a  power  of  two. 
Compute  the  cumulative  angular  deviant  function  as 


<t>  *  =  tan 


yry;-i 


-  tan' 


y;-yo 


xrx0 


2ir  . 


Then  use  the  FFT,  retaining  the  positive  frequencies  only.  The  resampling  and 
discrete  approximation  used  to  compute  the  angle  function  introduce 
distortion.  Also,  the  FFT  is  periodic,  so  there  is  aliasing  if  the  resampling  does 
not  effectively  lowpass  the  contour  sequence.  Another  source  of  distortion  is 
that  now  the  samples  are  not  necessarily  uniformly  distributed  along  the 
contour. 

In  the  experiments  discussed  later,  the  direct  method  of  calculation  was 
employed. 


6.4  Normalization 

The  cumulative  angular  deviant  is  already  normalized  for  translation, 
rotation,  and  dilation.  This  leaves  only  a  starting  point  shift  to  normalize. 
The  original  authors  [ZAHN72]  used  algebraic  combinations  of  the  phases  of 
the  coefficients  to  obtain  starting  point  shift  invariance.  It  is  difficult  however 
to  determine  if  this  transformation  has  retained  all  the  necessary  shape 
information. 

The  method  used  in  this  report  normalizes  the  Fourier  descriptors  of  the 
cumulative  angular  deviant,  dn,  by  employing  the  time  shift  property  in  a 
manner  similar  to  that  used  in  the  Fourier  descriptors  of  the  boundary  function 
normalization.  The  positive  components  having  the  first  and  second  largest 


moduli  are  found,  |  dkl|  and  |  dk2| ,  respectively.  The  starting  point  is  shifted 
so  that  the  angle  of  the  kl(h  component  has  zero  phase,  i.e. 

-in 

A  i  -  J  e 
un  un c 

For  the  unknowns,  if  the  tx|dk]1|  <  |  dk2| ,  where  t  =  0.95,  then  a  second 
feature  vector  is  computed  using  the  phase  of  dk2,  i.e. 

-in  — 

A  —  A  k2 

un  un c 

This  was  done  to  prevent  noise  from  causing  the  wrong  component  to  be 
chosen  for  the  normalization.  In  the  recognition  experiments,  both  feature 
vectors  are  compared  to  all  the  library  feature  vectors.  The  unknown  is  labeled 
according  to  the  smallest  distance  for  both  of  the  feature  vectors.  Since 
property  3)  states  that  the  d0  coefficient  is  redundant,  it  is  dropped  from  the 
feature  vector. 


5.5  Generic  Shapes 

Following  are  a  few  generic  shapes  with  their  Fourier  descriptors  of  their 
cumulative  angular  deviant. 


1)  Rectangle:  For  a  rectangle  traced  clockwise  with  the  starting  point  on  the 
upper  right  hand  corner, 


(A!,)  =  L  =  2(a  +  b)  . 


In  that  case  the  coefficients  are 
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d 


n 


1 


a 


i  +  (-i)n 


•  _jrnb_ 

frHb  \  J  2{*  +  b)  pin/2 
2(a  +  b)  6 


n  /  0 


or 


.  irnb 

d2n  =  ±cos(-*f-)e*  +  \  d2n  +  1  =0,  n  =  1,2, 
n  a  +  b 


The  normalized  coefficients  dn  are 


_  1  ,  m lb  v  1  ->  *n  +  ^ 

d»  "  7cos,7Tb|e 


>  u2n 


^2n  +  1  0  1»2, 


In  the  special  case  of  a  square 


d.4n  -  ^  >  d4n  +  j  -  0,  j  -  n  -  1,2,  • 


2)  Isosceles  triangle:  For  an  isosceles  triangle  traced  clockwise 

{A^}  =  {cos  *(~— ■),  2cos_1(— ),  cos~J(  ~)} 
a  a  a 


{AP;}  =  {2b,  a,  a},  L  =2(a  +  b)  . 


The  Fourier  coefficients  are 


_  2i 


_  mt  A 

n  -  —e 

n  it 


;  *nb 
a  +  b 


(tt-cos  *(— ))  •cos(-^t~-)  +  COS  *(— )*(-l)n 
a  a  +  b  a 


In  the  special  case  where  the  triangle  is  equilateral 

d3n  =  ~  ,  d3n+J  =  0,  j  =  1,2  ,  n  =  1,2,  •  •  • 


m 

$ 
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3)  Circle:  A  circle  traced  clockwise  has  an  angle  function  0(0)  =  .  Hence 

L 

<pm  =  0.  — ►  dn  =  0 ,  Vn  . 

The  Fourier  descriptor  magnitudes  of  the  cumulative  angular  deviant  for 
an  F104  contour  are  shown  in  Figure  5.2.  The  contours  reconstructed  from 
increasing  numbers  of  coefficients  are  presented  in  Figure  5.3. 

5.6  Experimental  Results 

The  feature  vector  for  the  Fourier  descriptors  of  the  cumulative  angular 
deviant  were  formed  by  dropping  the  DC  (d0)  component  and  listing  the  real 
and  imaginary  parts  of  the  positive  frequency  components,  i.e. 

f  =  (Re{d1},Im{dI},  •••  , Re{dn_i},Im{dn_i))T  . 

In  most  cases  the  full  feature  vector  was  formed  from  the  first  32  Fourier 
coefficients.  Then  the  principle  components  transformation  was  used  to  reduce 
the  feature  to  12  real  numbers. 

The  image  resolution  experiment  was  performed  using  the  feature  vector 
reduced  using  the  principle  components  transformation  to  12  real  numbers. 
The  results  are  provided  in  Table  5.1  and  Table  5.2.  The  best  results  are 
obtained  when  the  prototype  image  resolution  matches  that  of  the  unknown. 

In  Table  5.3  are  the  results  obtained  for  the  feature  vector  experiment.  It 
is  not  clear  why  the  classification  accuracy  should  decrease  with  an  increasing 
number  of  features. 

Table  5.4  provides  the  results  for  the  imaging  noise  experiment.  In 
general,  the  performance  increases  with  increasing  signal-to-noise  ratio.  There 
is  a  large  increase  in  classification  accuracy  between  the  10  and  20  dB  signal- 


u  *  - 1  *  — -  *- 


4 
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Table  5.2  CAD  image  resolution  experimental  results:  median  angle 
A*„A*y  ( ° ). 


Unknown 

Resolution 

16 

Prototype  Resolution 

32  64  128  256 

512 

16 

22,15 

75,21 

78,21 

67,21 

32 

32,15 

18,0 

22,12 

43,13 

41,12 

36,12 

64 

41,15 

25,12 

23,0 

21,0 

32,0 

32,12 

128 

36,15 

wSm 

17,0 

16,0 

16,0 

256 

34,17 

mmm 

17,0 

17,0 

17,0 

512 

\rn%m 

EH 

16,0 

17,0 

Table  5.3  CAD  feature  vector  experimental  results. 


Number  of 

Number  of 

Classification 

Median  Angle 

Fourier 

reduced 

Accuracy 

Error 

Coeffs. 

Features 

(%) 

M,A*VC) 

4 

6 

r"  48.67 

22,13 

8 

14 

58.67 

16 

30 

56.67 

17,9 

32 

62 

53.67 

17,8 

64 

126 

49.67 

21,9 

16 

12 

53.0 

16,9 

32 

12 

52.67 

16,9 

64 

12 

53.33 

16,9 

to-noise  ratios.  There  seems  to  be  a  threshold  effect  occurring. 

Table  5.5  provides  the  results  of  the  library  sampling  experiment. 
Classification  accuracy  does  not  increase  dramatically  with  an  increasing 
number  of  views  as  would  be  expected. 

The  partial  shape  experiment  was  also  performed.  The  results  are  listed  in 
Table  5.6.  The  performance  quickly  degrades  as  the  contour  is  chopped. 

5.7  Conclusions 

The  performance  of  the  cumulative  angular  deviant  Fourier  descriptors  is 
only  moderate  to  even  poor.  Also  the  need  to  take  the  derivative  and  calculate 
the  inverse  tangent  even  before  a  Fourier  transform  is  taken  increases  the 
computation  complexity.  It  is  also  more  difficult  to  obtain  insight  into  the 
geometric  properties  of  a  shape  because  of  the  nonlinearity  of  the 
transformation.  It  is  difficult  to  understand  how  the  high  classification 
accuracy  results  reported  in  the  literature  were  obtained  with  this  method  in 
view  of  the  facts  as  presented  here. 
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Table  5.6  CAD  partial  shape  experimental  results 


Percent 

Contour 

Chopped 

Classification 
Accuracy 
(%)  ' 

Median  Angle 
Error 

( 6 ) 

0 

52.67 

16,9 

10 

34.67 

28,16 

20 

26.0 

30,11 

30 

26.0 

32,23 

40 

25.33 

68,27 

50 

19.0 

90,40 
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CHAPTER  8 

MOMENTS  OF  THE  SILHOUETTE 


6.1  Introduction 

An  internal  spatial  shape  analysis  method  will  now  be  introduced 
[HU62,TEAG80,DUDA77].  Let  the  image  be  a  two  dimensional  function, 
f(x,y).  If  f(x,y)  is  piecewise  continuous  and  has  bounded  support,  then  the 
sequence,  {Mpq},  of  moments  describes  f(x,y),  where 


Mpq  =  /  /  xpyqf(x,y)dx  dy 


These  moments  are  often  called  the  conventional  or  raw  moments.  The 
moment  Mpq  is  said  to  be  of  order  n  =  p  +  q.  This  characterization  is  the 
projection  of  f(x,y)  on  the  basis  set,  the  monomials,  xpyq.  The  monomials  are 
complete  basis  set  for  the  functions  of  the  class  described  above.  The  set 
however  is  not  orthogonal. 

The  shapes  dealt  with  here  are  silhouette.  So,  let  the  set  of  points 


contained  in  the  object  be  O.  Then 


f(x,y)  = 


1,  (x,y)GO 
0,  (x,y)gO 


Since  the  monomials  are  non-zero  almost  everywhere,  this  is  a  global  shape 
analysis  method. 


6.2  Properties 


Listed  below  are  a  number  of  properties  of  the  moment  set  for  the 
function  f(x,y). 

1)  Translation:  Let  Mpq  be  the  moment  of  order  n  =p  +  q  for  f(x,y). 
Translating  f(x,y)  by  the  shift  (a,  b),  then 

f(x,y)  =  f(x-a,y-b)  -  »<p,  =  M' „,  =  £  £(J)  (J)  -  M„ 

r=0  s=0 

When  (a,  b)  is  chosen  such  that  /i^,  =  0,  the  set  {/ipq}  is  called  the  central 
moments. 

Proof: 

OO  OO 

M'pq  =  /  /  xpyqf(x-a,y-b)  dxdy 

“00-00 

OO  00 

=  /  /  (x  +a)P(y  +  b)qf(x,y)dxdy 

-oo-oo 

OO  0° 

=  /  /  S 

-oo~oor=0 

n 

2)  Dilation  (scale)  :  If  the  extent  of  the  object,  O,  is  increased  by  X,  then 

f(x.y)  =  f(x/X,y/\)  -  M'„,  =  X"+’+JM„. 

Proof: 

00  OO 

M'pq  =  /  /  xpyqf(x/X,  y/X)  dx  dy 


^](g)bq~8y*  f(x,y)dxdy  . 
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=  /  /  (Xx)p (Xy)q f(x , y) X dx  Xdy  . 


3)  Rotation:  If  an  object  is  rotated  through  an  angle  0  about  the  origin,  then 
f  (x ,  y)  =  f(x  cos0  -  y  sin0,  x  sin0  +  y  cos0) 


r=0  s=0 


Proof: 


M'p  =  J  f  xpyq  f(x  cos0-ysin0,  xsin0  +  ycos0)dxdy 


=  /  /  (x  cos0  +  y  sin0)p  (y  cos0-  x  sin0)q  f(x ,  y)  dx  dy 


00  OO 

=  /  /  S(?)(xcos0)P”r(ysin0)r 


-oo-oor=0 


■^j(c)(ycos^)*(x sin0)q_8(-l)q  5  f(x,y)dx  dy 


4)  Reflection:  If  an  object  is  reflected  about  the  y-axis,  then 
f'(x.y)  =  f(~x,y)  -  M'p,  =  (-l)PMpq. 

Similarly,  for  a  reflection  about  the  x-axis, 

r(x,y)  =  f(x,-y)  —  M'p  =  (-l)q Mp. . 


•j#  • 
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6)  Combined  moments:  If  there  are  two  objects  O  and  P  such  that 
O  U  P  =  0,  then  the  moments  of  the  union  of  the  two  sets  is  just  the  sum 
of  the  corresponding  moments,  i.e. 

M  0UP  =  M  0  +  M  p 
mpq  1  ‘pq  ‘pq  * 

In  general,  if  h(x,y)  =  f(x,y)  +  g(x,y),then 

M  h  =  M  f  +  M  8 

This  property  can  make  moments  useful  in  combining  the  disconnected 
regions  of  a  shape  when  noise  causes  segmentation  errors.  Also,  if  there 
were  sufficient  storage,  a  two  dimensional  shape  could  have  its  components 
stored  separately.  Then  this  method  could  be  used  for  partial  shape 
recognition  by  combining  the  moments  of  the  components  until  a  good 
match  is  made. 


6.3  Other  Moments  Sets 

The  theory  of  orthogonal  polynomials  provides  a  host  of  complete  sets  of 
basis  functions.  In  general,  however,  they  do  not  have  all  the  simple 
relationships  for  all  the  transformations  mentioned  early.  Others  [TEAG80, 
REEV81a,  HU62]  have  used  these  alternate  characterizations  to  exploit  some 
particular  property.  Some  of  these  alternatives  are  now  introduced. 


1)  Legendre  momenta :  The  Legendre  polynomials  are  a  complete  orthogonal 
basis  set  on  the  interval  (-1, 1).  The  nth  order  Legendre  polynomial  is 


M«>  =  tv  =  g.t„(2n^fk)!  ,kl, 

j=o  k=o  2n  k!(n  -  k)!(n  —  2k)! 


,n-2k 


The  Legendre  moment  of  order  m  +  n  is 
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K„  =  (2m  +  *f "  +  11  /  /  PJ*)  P»(y)  t(x,y)dx  dy  . 

The  function  f(x,y)  can  be  approximated  by  a  continuous  function  which 
is  a  truncated  series 

f(x,y)  -  S  XN_m  m  Pn_m(x)  P m(y)  . 

m=0 


2) 


Zernike  moments:  The  Zernike  polynomials,  Vnf(x,y),  are  complex  valued 
functions  orthogonal  on  the  unit  circle  x2  +  y2  =  1.  They  are  defined  as 

Vn#(x  ,y)  =  Rnf(r)  e*#  , 


where  n  is  the  order,  n>f  >0  and  n-0  is  even.  The  real  valued  polynomial 
Rnt(0  is  defined  as 


n-t 

R„,(r)  =  SH)S 

8=0 


(a~sI- 


»n-2s 


The  Zernike  moments,  Anf,  of  order  n  is  defined  as 

IT  00 

A„,  =  ^  /J[Vn,M)r  gW)rdrd«, 

"  -irO 


where 


g(r,0)  =  f(r  cosfl,  r  sin0)  . 


A  rotation  of  the  object  by  an  angle  a,  implies 

v  =  • 


A  translation  of  the  object  however  is  difficult. 


The  Zernike  moments  are  related  to  the  central  moments.  The  radial 
polynomials  have  the  form 

Mr)  =  SB„,krk. 

k  =  l 

Then  the  Zernike  moments  are 


*  A*k— 2j  +m,2j+f-m  > 


where  q  =  (k  -  0  )/2. 

If  P(x,y)  is  the  reflection  of  f(x,y)  about  a  line  through  the  origin  at  an 
angle  of  0  with  respect  to  the  y-axis  then 

=(A„, 

3)  Rotational  momenta:  The  rotational  moments  [REEV81a]  are  complex 
valued  and  defined  as 

IT  OO 

Fnf  =  J  Jr"  etf,g(r,0)rdrd0  , 

-7T  0 

where  n  is  the  order  and  n>0  >0  and  n-0  is  even, 

g(r,0)  =  f(rcosfl, rsinfl),  for  r  =  \/x2  +  y2  and  9  =  tan_,(-^-)  . 

The  rotational  moments  can  be  obtained  from  the  conventional  moments 
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Fn,  = 


JL± 

2  * 

£  EH) 

j=0  k=0 


n-0 


2 

i 


Mn-f  + 


n-f  +k-2j,f-k+2j 


For  the  special  case  of  9  =  n,  then 

F„n  =  ^  (  .  )  Mj,  n-j 


j=0  J 

So,  the  nth  order  rotational  moment  is  a  linear  combination  of  the 
conventional  moments  of  order  less  than  and  equal  to  n.  If  the  image  is 
rotated  by  an  angle  0 ,  then 

F'n,  =Fn,eif#. 

If  the  image  is  reflected,  then 

=  F‘., 


If  the  function  has  N-fold  symmetry,  then 

if— m 

Fnf  =  Fni  e  N  ,  m=0, 1,  •  •  • ,  k  — 1 


So, 


F  =  F  e 

1  nn  *  nn  ^ 


.  2ir 
in—  m 
N 


i2irn  tt 


F  Bn*(l  -  e  N1  =0,  m  =0,1,  •••  ,k-l 


4 


<  r  r 
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— ►  Fnn  =  0,  for  n  mod  N  ^  0  . 

This  provides  a  way  to  detect  symmetry.  Because  of  noise,  the  higher 
order  moments  become  less  reliable.  Therefore,  a  test  on  the  magnitude  of 
Fnn  should  have  a  threshold  increasing  with  n. 


6.4  Normalization 

In  order  to  perform  shape  recognition  it  is  often  necessary  to  normalize  the 
shape  for  translation,  scale,  and  rotation.  Translation  normalization  is 
accomplished  by  computing  the  central  moments,  i.e. 


/Jpq 


-it 

--no-n 


q 

Is, 


(-M10rr(-M01rsMr8 . 


Scale  normalization  is  performed  such  that  ftw'  =  1.  Thus, 


Rotation  normalization  is  much  more  difficult.  Many  investigators  have 
resorted  to  the  theory  of  algebraic  invariants  [HU62,  DUDA77].  This  theory 
provides  measures  calculated  from  polynomials  of  the  central  moments  that  are 
invariant  under  linear  transformations  such  as  a  rotation. 


A  popular  alternative  is  to  determine  the  principle  axis  of  the  image. 
Then  rotate  the  shape  by  transforming  its  moments,  to  place  the  principle  axis 
along  the  x-axis  [HU62,REEV81a,REEV81b]. 

Normalize  for  rotation  by  the  angle  a,  where 


tan(2a)  =  2 


P'  n 


P  20  P  02 


This  angle  a  to  rotate  the  object  however  is  ambiguous. 


An  additional  180 


k 


$ 


v 
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rotation  will  also  place  the  principle  axis  along  the  x-axis.  It  therefore  is 
necessary  to  add  the  constraints  that  £20 >£02  an<^  /*3o>0>  where  /ipq  are  the 
rotated  moments.  Shapes  with  N-fold  symmetry  however  have  N  principle 
axes. 

For  the  experiments  carried  out  in  this  report,  the  rotation  normalization 
is  carried  out  with  the  help  of  the  rotational  moments,  Fnn.  The  moments  are 
normalized  for  translation  and  scale.  Then  the  rotational  moments  up  to  the 
5th  order  are  calculated.  The  phase  of  the  nth  order  moment  (  2<n<5)  having 
the  largest  magnitude  is  used  for  rotation  normalization.  The  moments  are 
rotated  so  the  phase  of  the  rotated  nth  order  rotational  moment  is  zero,  i.e.  the 
rotation  angle  is  a  =  -Z.Fnn/n  .  Then  the  rotational  moments  are  calculated 

2tt1c 

again.  There  are  still  n-1  rotations  of  - ,  k  =  1,  •  •  •  ,  n-1,  that  would 

n 

allow  the  nth  order  rotational  moment  to  have  zero  phase.  To  choose  among 
these  possible  rotations,  the  following  criterion  is  maximized  for  the  rotational 
moments  of  the  rotated  moments: 

ERe{Fnn}  . 

n=2 

This  essentially  chooses  the  normalization  that  gives  the  highest  degree  of 
symmetry  about  the  principle  axis. 

As  with  the  other  shape  analysis  methods,  an  additional  feature  vector  is 
provided  for  the  unknown  shape  if  the  magnitude  of  the  next  largest  rotational 
moment  is  greater  than  95%  of  the  largest.  This  rotational  moment  is  used  for 
normalization  in  the  same  way  as  mentioned  above. 


'  i  i  MUfrfai' 
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8.5  Calculation 

The  images  generated  for  the  experiments  are  discrete,  so 
N  N 

f(x,y)  =  £f(xityj)  6(x-xity-yi).  Therefore,  the  double  integral  becomes 

i=ij=i 

MPq  =  S  Exipyiq  fUi,  y;)  • 

i=ij=i 

A  discrete  version  of  Green’s  theorem  has  been  derived  [TANG82]  that  makes 
it  possible  to  compute  the  moments  from  the  chain  coded  boundary  instead  of 
the  discrete  image. 

MPq  =  EFx(xi.y-.)DY(ai-i.ai)  +  xipy»q  Cv(ai-i,  a,)  , 


where  the  (xj,  yj  are  the  sequential  boundary  points  (positive  integers), 

Xi 

Fx(xi»yi)  =  y;q  S*p>  aQd  ai  *s  t*ie  direction  chain  code.  Table  6.1  and  Table  6.2 
i=l 

enumerate  DY  and  Cy.  The  computation  of  the  moments  can  be  speeded  up 

considerably  by  having  a  table  of  functions  for  ip.  For  example, 

i=o 

=  n?jn±  ljj 

i=0  4 


6.6  Reconstruction 

It  is  often  important  to  be  able  to  reconstruct  an  approximation  to  the 
original  intensity  function  f(x,y)  from  the  moments  set.  If  f(x,y)  is  of  the  class 
of  functions  already  stated,  then  its  Fourier  transform  exists 
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OO  CO 

F(u,v)  =  /  /  e  i2,r‘ux  +  v*>f(x,y)dxdy  . 

-■oo-oo 


Since,  f(x,y)  is  at  worst  piecewise  continuous  and  of  bounded  support,  then 
F(u,v)  can  be  expanded  into  a  power  series 

FKv)  =  SE1--,?,-  MjkUJvk, 
j=0k=0  JK! 

where  Mjk  is  the  (j,  k)th  moment  of  f(x,y).  If  the  inverse  Fourier  transform  is 
attempted,  then 

00  00  oo  oo  +  k 

f(x,y)=  /  /  E  E-LJT^Mik^vkei2^x  +  vy)dxdy  . 

-oo-oo  j=0k=0  b 

Since  the  limits  of  integration  and  summation  are  inGnite,  they  cannot  be 
interchanged.  So,  some  other  alternative  must  be  explored. 

The  theory  of  inGnite  dimensional  normed  linear  spaces  provides  the 
necessary  tools.  A  theorem  states  that  the  closest  Gt  to  the  function  on  the 
subspace  formed  by  a  Gnite  set  of  the  basis  functions  is  the  orthogonal 
projection  onto  that  subspace.  Then  the  approximation  g(x,y)  to  f(x,y)  is  just 
the  sum  of  the  reciprocal  basis  functions  for  that  subspace  weighted  by  the 
coefficients  obtained  in  the  projection,  i,e. 

f(x,y)  at  g(x,y)  =  EQi^i(x>y)  - 
ieN 

where 

00  OO 

Qfj  =  /  /  f(x,y)<ft(x,y)dxdy 

-OO-OO 


and  Ot  are  such  that 
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00  00 

J  j>(x,y)0k(x,y)dxdy  =  6a 

-oo-oo 

The  reciprocal  basis  for  the  monomials  {xpyq}  is  very  complicated.  To  avoid 
computing  the  reciprocal  basis,  the  moment  set  for  the  monomials  are 
converted  to  Legendre  moments,  Xmn,  where 

E  Ec„iC„kMjk, 

jsOk=0 

where  Cmj  are  the  Legendre  coefficients.  That  is,  the  Legendre  polynomials  are 

Pm(x)  =  ECmjXj  -  xe[-l,l]. 

j=0 

Since  the  Legendre  polynomials  are  orthogonal,  they  are  their  own  reciprocal 
basis  (except  for  a  multiplicative  normalization  constant.)  So, 

f(x,y)  ~  g(x,y)  =  £  E  xn~m,m  Pn-m(x)Pm(y)  » 

n=0  m=0 

where  g(x,y)  can  also  be  written  as 

g(x-y)  =  E  E  6n-m,mxn_mym 

n=0  m=0 


So,  g(x,y)  is  a  polynomial  whose  moments  up  to  order  N  match  those  of  f(x,y). 
Therefore,  care  must  be  taken  to  make  sure  that  the  moments  are  for  an  object 
which  is  contained  in  such  a  region.  If  not,  the  moments  must  be  suitably 
scaled. 

Another  reconstruction  method  is  based  on  the  discrete  nature  of  the 
(NxN)  images  and  their  transformed  moments  [ROSE76].  Since, 
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N-1N-I 

mp,  =  S  Exipyjqf(xi>yj)  > 

i=0 j=0 


a  matrix  [Mpq]  can  be  written  as 


|MJ  =  [P(p,i)l(t(i,i)llQ(i.q)l , 


where 


j)  = 


1,  ‘f  f(Xi,yj)  =  1 , 
0,  if  f(x„  yj)  =  0 


P(p,,i)=xip  and  Q(j,q)=yjq 


The  transform  can  be  inverted  so  that 


where 


(f(*ij)jNxN  (P  (b  P)lNxn  [^pqlnxn  (Q  (Qfj)lnxN  » 


[P  l(i.P)l  =  (P(P,i)]  1  and  [Q_1(qj)l  =  [QMl  1 


The  reconstruction  is  performed  over  the  region  [-A,  A]x[-A,  A],  So, 
2A  2A 

X;  =  -j^-i  and  y;  =  -^-i,  i  =  ~N/2,  •  •  •  ,N/2.  It  is  important  to  chose  A  so 

that  the  image  described  by  the  moments  is  contained  in  the  region  of 
definition.  Figure  6.1  shows  the  original  silhouette  for  the  letter  ‘E’.  Figure 
6.2  shows  the  reconstruction  of  the  letter  ‘E’  from  its  moments  up  to  15th  order 
using  this  discrete  inverse  method. 


The  letter  ’IT  reconstructed  from  the  15th  order  moments 
the  discrete  inverse  method. 


162 


8.7  Generic  Shapes 

The  following  are  the  exact  formulae  for  the  moments  of  an  ellipse  (circle) 
and  a  rectangle  (square). 

1)  Rectangle:  For  a  rectangle  of  length  ‘a’  and  height  ‘b’  centered  at  the 
origin 

M  =  jl.+  Hfl|l+ 
pq  (p  +  l)(q  +  l)2p+<>+2 

ap+lb<>+i 

- —  ,P  and  q  even 

_  (p+l)(q  +  l)2P+1 

0 ,  p  or  q  odd 

The  rectangle  is  symmetric  about  both  the  x  and  y  axes,  so  notice  that 
Mpq  =  0  for  p  and  q  odd. 

There  is  no  need  to  normalize  for  translation  since  \f01  =  Mi0  =  0.  After 


normalizing  for  scale 


M'  = 

11  pq 


sza. 

(f)  2 

- —  ,  p  and  q  even 

(p  +  l)(q  +  l)2P*x 

0  ,  p  or  q  odd 


For  the  special  case  of  a  square  having  side  of  length  ‘a’, 


M  '  = 

mpq 


(p  +  l)(q  +  l)2p  +  q  p  and  q  even 
0  ,  p  or  q  odd 


The  rotational  moments,  Fnn,  for  a  rectangle  for  n  =  l  to  5  are 


F<x>  -  Moo'  —  1  i  F*!!  —  0  , 


F“  =  i^blb2-*2)-  F“=°’ 


1  b2  a2 

F«  =  i(^  +  ^'  F“  =  0- 


If  a>b  then  F22<0.  So,  the  normalization  procedure  will  rotate  the  rectangle 
— ,  so  the  major  axis  is  along  the  y  axis. 

It 

2)  Ellipse:  For  an  ellipse  with  major  axis  of  length  2a  and  minor  axis  of  2b 
(a>b)  centered  at  the  origin,  the  moments  are 


Mpq  = 


-  (-i)Pl  [i  +  (-M 
2  (p  +q  +  2) 


lP+ib,+iB(<l±I  J>±I)t 


where  B(m,n)  =  ‘■-f-  (GRAD80J.  Again  the  ellipse  is  already 

I  (m  +  n) 

normalized  for  translation.  Now  to  normalize  for  scale, 
X  =  1/^/Moo  =  l/\Arab.  So, 


M  '  = 
1  pq 


(1  +(-l)pl[I  +  (~l)q]  ,  a,  2  B(£±I  2±L) 

p+q+2  [b’  [  2  ’  2  ’  ' 

2  (p  +  q  +  2)  n  2 


In  the  special  case  where  a  =  b,  we  have  a  circle  whose  moments  are 


M  '  = 
lvlpq 


B(jl±i  £±i) 

p+q+2  '  2  ’  2 

2(p  +  q  +  2)  Jr  2 


6.8  Experimental  Results 

The  feature  vector  for  the  experiments  was  formed  by  dropping  the  (0,0), 
(0,1),  and  (1,0)  moments  since  they  are  identical  for  every  normalized  moment 
set.  The  remaining  moments  up  to  the  nth  order  were  then  listed  from  the 
lowest  to  highest  as  follows: 

1  =  (^20-^11'  /‘02»^30«  /*21»  ■■■  >^0n)T  • 

For  most  of  the  experiments  this  full  feature  vector  of  9th  order  moments  is 
used  to  12  real  numbers  using  the  principle  components  transformation. 

The  image  resolution  experiment  was  performed.  The  results  are  provided 
in  Table  6.3  and  Table  6.4.  The  performance  is  low,  but  follows  the  same 
trend  as  the  other  methods  discused  previously. 

The  feature  vector  experiments  results  are  listed  in  Table  6.5.  The 
unusual  fact  to  note  here  is  that  the  performance  degrades  with  an  increasing 
number  of  features.  In  an  attempt  to  provide  insight  into  this  phenomenon,  an 
additional  experiment  was  carried  out.  It  was  observed  that  the  high  order 
moments  were  on  the  order  of  10*  times  smaller  than  the  low  order  moments. 
So,  an  identical  experiment  was  performed  except  the  moments  were  converted 
to  Legendre  moments  before  being  reduced  or  classifled.  The  Legender 
moments  appear  to  have  values  in  a  smaller  dynamic  range.  The  results  of  this 
experiment,  however,  were  nearly  identical  to  the  original  experiment.  An 
another  experiment  was  performed  where  the  eigenvectors  used  in  the  principle 
components  transformation  were  normalized  by  dividing  by  the  eigenvalue 
corresponding  to  that  eigenvector.  The  results  for  this  experiment  were  even 
worse  than  with  no  eigenvalue  normalization. 


i 
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Table  6.3  MOM  image  resolution  experimental  results:  classification 
accuracy  (%). 


Unknown 

Resolution 

16 

32 

Prototype  Resolution 
64  128 

256 

512 

16 

28.67 

21.67 

19.67 

16.33 

16.0 

16.33 

32 

25.0 

37.0 

25.0 

24.0 

25.0 

24.0 

64 

18.33 

31.33 

36.67 

33.67 

29.33 

28.0 

128 

19.0 

29.0 

38.33 

40.67 

37.67 

34.0 

256 

18.33 

25.67 

31.33 

40.0 

39.67 

38.33 

512 

19.0 

22.0 

31.67 

38.0 

40.67 

40.33 
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Table  6.4 


MOM  image  resolution  experimental  results:  median  angle  error 
A0x,A*y  ( * ). 


A 


Table  6.5  MOM  feature  vector  experimental  results. 


Order  of 
Moments 

Number  of 
reduced 
Features 

Classification 

Accuracy 

(%) 

Median  Angie 
Error 

Mr  A  <M°) 

3 

7 

37.67 

26,16 

7 

33 

32,15 

0 

52 

39.33 

39,16 

14 

117 

37.33 

39,15 

7 

41,16 

9 

45,17 

14 

12 

41,17 
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The  imaging  experiment  was  also  carried  out.  The  results  are  presented  in 
Table  6.6.  The  performance  improves  with  signal-to-noise  ratio.  There  is  a 
slight  threshold  effect  between  3  and  6  dB  for  the  64x64  image  resolution 
unknowns.  However,  the  performance  increases  only  slightly  over  the  entire 
range  of  signal-to-noise  ratios. 

Table  6.7  lists  the  results  obtained  when  the  library  sampling  experiment 
was  performed.  Again,  the  performance  improves  with  increasing  library 
density,  but  the  increase  is  minimal. 

Finally,  the  partial  shape  experiment  was  carried  out  with  its  results  listed 
in  Table  6.8.  The  performance  of  the  moments  is  low  to  begin  with,  but  it 
degrades  slowly  with  increasing  amounts  of  the  contour  being  chopped. 

6.0  Conclusions 

The  performance  of  the  moments  of  the  silhouette  is  unexpectedly  very 
poor.  The  fact  that  the  moments  are  used  to  attempt  an  approximation  to  a 
discontinuous  silhouette  may  explain  the  decrease  in  performance  with  even 
higher  order  moments  (  a  Gibb’s  phenomenon.)  But  is  seems  more  likely  that 
crucial  information  is  lost  when  the  raw  or  conventional  moments  are 
computed  using  the  monomials,  xpyq,  with  p  and  q  large.  That  is,  maybe  the 
problem  is  with  the  numerical  precision  of  the  calculations.  It  is  possible  that  a 
nonlinear  combination  of  the  moments  such  as  moments  invariants  would 
provide  the  transformation  necessary  to  further  extract  the  shape  information. 
This  seems  unlikely,  however.  A  great  deal  of  insight  might  be  obtained  by 
improving  the  reconstruction  methods.  It  is  also  hard  to  explain  why  the 
results  for  these  particular  shape  experiments  are  so  much  lower  than  those 
quoted  in  the  literature. 
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Table  6.7  MOM  library  sampling  experimental  results. 


Library 

Views 

Library 

Resolution 

Unknown 

Resolution 

Classification 

Accuracy 

(%) 

Median  Angle 
Error 

( ° ) 

143 

256 

64 

29.33 

58,20 

40 

23.67 

46,21 

0 

26.33 

45,21 

143 

256 

37.67 

45,17 

49 

32.33 

26,15 

9 

27.33 

32,17 

143 

256 

256 

39.67 

28,15 

49 

35.33 

26,15 

9 

27.0 

40,18 
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Table  6.8  MOM  partial  shape  experimental  results. 
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CHAPTER  7 

MOMENTS  OF  THE  BOUNDARY  CURVE 

7.1  Introduction 

If  a  region  in  the  plane  is  simply  connected  and  its  boundary  is  smooth, 
then  this  silhouette  is  completely  specified  by  its  boundary.  Moments  of  the 
boundary  curve  have  been  used  [DUDA77]  before  for  shape  recognition,  but 
there  has  been  a  lack  of  a  theoretical  basis.  Let  all  the  “mass”  of  the  object  be 
concentrated  on  the  boundary.  Let  Tf(x,y)  =  0  be  the  equation  that  describes 
the  boundary  curve.  Then  the  moments  of  the  boundary,  BMpq,  are 

OO  00 

H,  =  /  }xf yi«Mx,y))dx<Jy, 

-00-00 

where  &(')  is  the  generalized  delta-function  [GELFfi4]  .  This  is  a  global  shape 
recognition  method. 


7.2  Normalization 

The  operations  needed  to  normalize  the  moments  of  the  boundary  are 
exactly  the  same  as  for  the  moments  of  the  silhouette  with  the  exception  of 
dilation  (or  scale.)  To  find  the  proper  relationship,  the  moments  of  the  circle 
boundary  will  be  used. 

The  equation  for  a  circle  of  radius  c  centered  at  the  origin  is  x2  +  y2  =  c2, 
or  in  polar  coordinates  p  =  c.  So, 


CSMOUB 


..  .A  . 
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®Mpq  =  //pp+q+l  cosp0  sinq0«(/>-c)dpd0 


BMp<1,  =  |1  |I  ±l-l)-j  ^Jt+L  <l±V+,+, 
2  2  2 


Now,  if  the  circle  is  scaled  so  it  has  a  radius  of  Xc,  then 
2iroo 

=  //pP+<i+>cosPtfsmqd^-Xc)dpd0 
o  o 


=  xp+q+1cp+q+1  /cosPffsin^dfl  . 

o 


This  would  seem  to  imply  that 

BMpq'  =  Xp+q+1  ®Mpq  . 

So,  to  normalize  for  scale  changes,  let  X  =  1/BM00.  When  ^^x.y))  is  written 
as  6{p- c)  for  a  circle,  we  are  assuming  that  the  mass  is  of  unit  density  and 
uniform  mass  distribution.  On  the  other  hand,  if  #f7(x,y))  had  been  written  as 
SipP-c3)  for  a  circle,  it  would  mean  there  is  unit  mass  uniformly  distributed. 

7.S  Generic  Shapes 

Following  are  the  moments  of  the  boundary  curve  (MOMB)  for  the 
rectangle  (square)  and  a  circle. 


1)  Rectangle:  For  a  rectangle  centered  at  the  origin  with  length  ‘a*  and  height 
‘b’  with  major  axis  along  the  x  axis,  the  moments  of  the  boundary  curve 


2P+q  +  l 


r  a>bq 
p  +  1  q+1 


For  example,  BM00  =  2(a  +  b),  BM10  =  BM01  =  0, 
b  _  a2 

M20  —  —  (a  +  3b).  After  normalizing  for  scale,  the  moments  are 
o 


bM  '  = 

mpq 


f-iipih  +  r-nqi 


4P+q+i 


a  +  b  I  apbq 
p  +  1  q  +  1  J(a  +  b)p  +  q+1 


F or  the  special  case  where  a  =  b,  a  square, 


bM  '  = 
mpq 


-mil  +  (-l)ql  P  +  q  +  1 

8p+q  +  l  (p  +  1)  (q  +  1)  * 


2)  Circle:  For  a  circle  of  radius  c  centered  at  the  origin,  the  moments  of  the 
boundary  curve  are 


bM  '  =  cp+q+1 

*pq  c 


fcffllLitlil  B(£±i,  S±I, 
2  2  2 


For  example,  BMoo  =  2rrc,  BM10  =  BMo,  =  0,  and  BM2o  =  jtc3.  After 
normalizing  for  scale,  the  moments  become 


bM  '  = - - - 

Pq  (2jr)p+q+I 


(zliplU+(-l)q1.B(£+3. 

2  2  2 


For  example,  BMoo'  =  1,  BM10'  =  BM0I'  =  0,  and  BM2o'  =  — . 

Sir1 
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7.4  Calculation 

Since  the  images  are  discrete,  the  moments  of  the  boundary  curve  are 
calculated  from  the  (xv  yj)  coordinates  pairs  obtained  from  the  Freeman  chain 
code.  So, 

BMp,  =  - 

i=0 

Then  the  moments  are  normalized  for  translation,  scale,  and  rotation.  The 
translation  and  rotation  normalization  is  the  same  for  the  moments  of  the 
silhouette.  The  scale  normalization  is  carried  out  as  described  in  this  chapter. 

7.6  Experimental  Results 

The  feature  vector  for  the  moments  of  the  boundary  is  formed  in  exactly 
the  same  way  as  for  the  moments  of  the  silhouette.  The  (0,0),  (0,1),  and  (1,0) 
moments  are  dropped  and  the  rest  listed  as  follows: 

X  =  (B/‘20.B/Mi.EW  •  •  •  ,BPon)T  • 

For  most  of  the  experiments,  the  9th  order  moments  were  used  followed  by  the 
principle  components  transformation  retaining  12  real  numbers. 

The  results  of  the  image  resolution  experiment  are  provided  in  Table  7.1 
and  Table  7.2.  The  performance  is  best  when  the  prototype  is  of  a  resolution 
approximately  the  same  as  the  unknown.  The  256x256  library  resolution  has 
the  highest  average  classification  accuracy. 

The  results  of  the  feature  vector  experiment  are  shown  in  Table  7.3.  The 
performance  is  the  same  for  all  the  different  feature  vectors  investigated.  This 
would  imply  that  all  the  shape  information  (though  it  is  not  much)  is  contained 
in  the  3rd  order  moments  or  less.  It  was  observed  that  the  low  order  moments 
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Table  7.1  MOMB  image  resolution  experimental  results:  classification 
accuracy  (%). 
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were  of  many  orders  of  magnitude  larger  than  the  high  order  moments.  Since 
the  square  of  the  error  distance  measure  was  used  in  the  classification,  the 
differences  among  shapes  expressed  in  the  high  order  moments  would  be 
completely  dominated  by  the  low  order  moments.  This  problem  also  occurred 
for  the  moments  of  the  silhouette,  but  to  a  somewhat  lesser  extent. 

The  imaging  noise  experiment  was  also  performed.  Its  results  are  provided 
in  Table  7.4.  As  expected,  the  performance  improves  gradually  with  increasing 
signal-to-noise  ratio. 

The  library  sampling  experiment  results  are  listed  in  Table  7.5.  The 
performance  improves  moderately  with  increasing  number  of  library  views. 

Table  7.6  provides  the  results  for  the  partial  shape  experiment.  The 
performance  degrades  very  slowly  as  more  of  the  contour  is  chopped. 

7.6  Conclusions 

The  overall  performance  of  this  method  is  very  poor.  The  dynamic  range 
problem  indicated  for  the  moments  of  the  silhouette  seems  to  be  accentuated 
for  the  moments  of  the  boundary.  Something  would  have  to  be  done  to 
address  this  problem  before  this  method  would  be  of  much  use  for  shape 
recognition. 
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Library 

Resolution 

Unknown 

Resolution 

SNR 

(dB) 

Classification 

Accuracy 

(%) 

Median  Angle 
Error 

MXA<K  ( ° ) 

256 

64  ^ 

3 

17.33 

30,31 

6 

22.33 

46,36 

10 

26.0 

31,27 

20 

24.67 

39,17 

oo 

26.0 

49,17 

256 

128 

3 

15.67 

47,27 

6 

15.67 

49,32 

10 

22.33 

47,21 

20 

28.33 

29,22 

00 

28.33 

29,22 
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Table  7.5  MOMB  library  sampling  experimental  results.  j 


Library 

Views 

Library 

Resolution 

Unknown 

Resolution 

Classification 
Accuracy 
(%)  ' 

Median  Angle 
Error 

( ° ) 

143 

256 

64 

26.0 

49,17 

49 

21.67 

27,17 

9 

22.0 

46,16 

143 

256 

128 

28.33 

29,22 

49 

25.0 

25,17 

9 

22.33 

46,17 

143 

256 

256 

28.0 

25,16 

49 

• 

24.67 

21,13 

9 

22.33 

46,18 
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Table  7.6  MOMB  partial  shape  experimental  results. 


Percent 

Contour 

Chopped 

Classification 

Accuracy 

(%) 

Median  Angle 
•  Error 

A<t>x,A<t>v  ( ° ) 

0 

28.33 

29,22 

28.67 

23,18 

24.67 

34,26 

■rl 

24.33 

63,26 

40 

18.33 

50,33 

50 

22.33 

59,30 
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CHAPTER  8 

COMPARISON  OF  GLOBAL 
SHAPE  METHODS 
8.1  Introduction 

In  this  chapter  the  shape  methods  are  compared  in  two  ways.  First,  the 
analytic  relationships  between  pairs  of  shape  methods  are  discussed.  Then,  a 
comparison  is  made  based  upon  the  shape  recognition  experiments. 

8.2  Analytic  Comparisons 

The  Fourier  descriptors  of  the  boundary  function  and  the  Walsh  points 
are  very  similar.  The  basis  functions  for  both  methods  are  characterized  by  the 
number  of  zero  crossings  in  one  basic  interval.  The  Fourier  component  of  a 
particular  frequency  has  a  form  very  similar  to  the  corresponding  sequency 
Walsh  function.  A  basic  difference,  however,  is  that  the  Fourier  functions  are 
everywhere  continuous,  while  the  Walsh  functions  have  a  finite  number  of 
discontinuities.  The  discontinuous  nature  of  the  Walsh  functions  is  what 
causes  them  to  lack  a  simple  time  (starting  point)  shift  property  such  as  the 
Fourier  functions  exhibit.  This  difference  is  also  what  often  causes  the  distance 
measure  used  in  analyzing  approximation  errors  for  the  Fourier  and  Walsh 
functions  to  be  different.  Often  times  the  “sup  max”  norm  is  used  for  the 
Walsh  method  and  an  Euclidean  norm  for  the  Fourier  method. 

The  boundary  function  and  cumulative  angular  deviant  are  related 
through  a  nonlinear  transformation.  If  7(t)  is  the  boundary  function,  then  its 


derivative  can  be  written  as 


T(t)  =  |Mt)|  eiL*>. 


If  the  contour  was  traced  uniformly  (i.e.  as  a  function  of  arc  length),  then 

|  ^(t)  |  =  K  =  constant  . 


So, 


7(t)  =  K  e^)  . 


The  angle  function,  0(1),  of  chapter  5  is  then 

lT(t)  =  «(-t)  . 


0(1)  was  defined  for  the  object  traced  clockwise. 

T(t)=Ke  T  T 


Thus, 

,  te[o,T] . 


Hence, 


Si 


27m 

T 


=  Ke 


Determining  the  coefficients  of  -7  from  the  coefficients  of  <t>m  is  similar  to  the 
problem  of  finding  the  spectrum  of  a  phase  modulated  signal  from  the 
spectrum  of  the  modulation  signal. 

Now,  in  discussing  the  relationship  between  the  Fourier  descriptors  of  the 
boundary  and  the  moments  of  the  silhouette,  the  class  of  objects  will  be 
restricted  to  those  which  are  star  shaped  with  respect  to  a  (reference)  point. 
An  object  is  star  shaped  with  respect  to  a  point  if  a  line  segment  joint  that 
point  and  any  other  point  in  the  object  is  completely  contained  in  the  object 
(LAY82).  The  (p,  q)th  moment  of  a  silhouette,  0,  is  defined  as 


186 


OO  OO 

Mp-  =  /  /  xpyqf(x,y)dx  dy 


where 


f(x,y)  = 


1,  (x,y)60 
.0,  (x,y)gO 


Writing  x  and  y  in  polar  coordinates 


-if  +  p-i# 

x  =  p  cosff  =  p  ( - - - ) 


ei#  -  e~  J  4 
y  =  p  sintf  -  p  ( - - - ) 


Expanding  xp  and  yq 

ei#  +  e-'* 


xp  = 


P  (‘ 


=  if  ?  £  O  w'r"  («■*)■ 


2 ' 

*  m=0 


=  (£)p  £  O  'i|p-2m|' 


o '  *-•  vm' 

6  m=0 


= 


/  e>>  +  e_,#  1 
p{  2i  1 


=  {JLy\  ^  (i)2k_^  ei*<,-2k^ 
2  k=o 


Substituting  into  the  moment  equation 


2>r  oof 

Mp,  =  /  /  (f)p  t  O 

0  0  4  m=0 


( A)q  £  (i)2k'<l  ei^_2k^ 
2  k=0 


^p,9)pdpd9 


.  q  ._  . 


=  1 1  Sir  O  <2*  l  I?*"'  ****•*'»  <u , 

m=0  k=0  *  0  0 


where  g (p,0)  =  f(x  cos0,y  sinfl).  Then  for  a  fixed  $ 

1,  P<'*>(0) 

g (p,  0)  =  =  u(p  “  rb(0)) 

0,  />>rb(0) 


rb(0)  is  the  distance  from  the  reference  point  to  the  boundary,  and  u(t)  is  the 
Heaviside  step  function.  Since,  the  object  is  star  shaped,  rb(0)  is  unique  for  a 
fixed  6.  The  double  integral  can  be  written  as 


J  /pP+o  +  1  eilp+<r2(m+,t)l*  u (p  -  rb(0))  dpdO 
0  0 

2ir  r»(*l 

~  J  J  ^,P  +  q  +  l  ei(p+<!+2)#  e-i2(m  +  k  +  I)» 


4bn 

— - /Irb(0)Ip+q+2  e^p+,+2^  e~i2^m+k  +  l^  d 0  . 

p+q  +  2  q  1 


If  the  object  is  traced  so  that  'rfO)  —  rb(0)  e1*,  then  the  above  integral  becomes 


Notice  that 


c"  =  e‘"' ie 


If  we  let  f*nch  to  be  the  ith  element  of  the  nth  order  convolution  of  the  {cn} 
sequence  with  itself,  then 


Notice  that  if  the  boundary  is  traced  so  that  t( 9 )  =  rb(0)  e'f,  then  7  is  not 
uniformly  traced. 

A  more  general  relationship  between  the  moments  of  an  object  O  and  the 
Fourier  descriptors  of  the  boundary  function  7  can  be  obtained  using  Green’s 
theorem.  The  (p,  q)th  moment  of  a  simply  connected  object  is 

00  00 

MPq  =  /  JxPyqf(x,y)  dxdy  • 

-00-00 


Green’s  theorem  states  that 
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xp+,yq .  dy 

xpyq+1  dy 

(p  +  1)  dt 

(q  +  1)  dt 

dt 


x 

p  +  1 


.iL  _ 

dt 


_JL_ 

q  +  1 


dx 

‘dt 


Since, 


*(»)  = 


and  y(t)  = 

£x 


Then 


= 


pq 


=  ij;+5h+i)  M'->  <2>  •[**** "  7*1‘) 


+  (p  +  q+2)(^*  -  'rS))  7p  +  q~m~k  (7*)m  +  k  dt 

If  7(t)  can  be  written  as  a  finite  sum  of  Fourier  components  then,  in  principle, 
this  expression  can  be  evaluated.  In  chapter  3,  the  area  formula  in  terms  of 
the  Fourier  descriptor  coefficients  was  derived  using  this  expression,  i.e.  Area 
=  Moo- 


8.3  Empirical  Comparisons 

In  this  section  the  global  shape  methods  will  be  compared  on  the  basis  of 
the  evidence  provided  by  results  of  the  various  experiments. 

First,  the  image  resolution  experiment  will  be  discussed.  The  diagonal  of 
the  tables  of  results  for  the  image  resolution  experiment  are  plotted  together  in 
Figure  8.1,  Figure  8.2,  and  Figure  8.3.  This  experiment  sets  the  stage  for  the 
rest  of  the  combined  results  in  that  it  clearly  ranks  the  methods  performances. 


190 


Fourier  descriptors  of  the  boundary  (FDS)  and  Walsh  points  (WAL)  have  the 
best  results.  Then  the  cumulative  angular  deviant  Fourier  descriptor  method 
(CAD)  is  next,  but  at  a  significantly  lower  performance.  The  moments  of  the 
silhouette  (MOM)  and  the  moments  of  the  boundary  (MOMB)  have  the  worst 
performance.  The  performances  for  all  the  methods  quickly  deteriorate  below 
the  32x32  image  resolution.  Since,  this  occurs  for  all  the  methods,  it  is  likely 
that  this  is  due  to  the  particular  aircraft  shapes  being  used.  The  median  y 
angle  error  is  usually  somewhat  lower  than  the  median  x  angle  error.  This  is 
also  probably  due  to  the  aircraft  shapes  themselves. 

The  combined  results  for  the  feature  vector  experiment  are  plotted  in 
Figure  8.4,  Figure  8.5,  and  Figure  8.6.  The  results  for  this  experiment  are  the 
most  difficult  to  explain.  The  behavior  of  the  Fourier  descriptors  and  Walsh 
points  are  as  would  be  expected.  As  the  number  of  features  increase,  so  does 
the  classification  accuracy.  The  performance  for  the  cumulative  angular 
deviant  and  moments  of  the  boundary  increase  but  then  decrease.  It  is  difficult 
to  explain  why  this  occurs.  The  moments  of  the  boundary  on  the  other  hand, 
neither  increase  or  decrease  over  the  range  investigated.  This  would  indicate 
that  all  the  information  available  is  contained  in  the  first  few  moment  values. 

The  results  for  the  library  sampling  experiment  are  plotted  in  Figure  8.7, 
Figure  8.8,  and  Figure  8.9.  As  the  number  of  library  views  are  increased  the 
classification  results  improve.  Again,  the  Walsh  points  and  Fourier  descriptors 
of  the  boundary  are  very  close  with  the  other  methods  following  a  similar  trend 
but  with  a  poorer  performance  level. 

The  imaging  noise  experiment  results  are  plotted  in  Figure  8.10,  Figure 
8.11,  and  Figure  8.12.  All  the  methods  seem  to  exhibit  a  threshold  effect.  The 
Fourier  descriptors  and  Walsh  points  effectiveness  degrades  markedly  between 
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Figure  8.1  Combined  results  for  the  image  resolution  experiment: 
classification  accuracy. 
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Figure  8.2  Combined  results  for  the  image  resolution  experiment:  median  x 
angle  error. 


I 

.  — uliaaA 


fledian  V  angle  error  (degr 


Figure  8.3  Combined  results  for  the  image  resolution  experiment:  median  y 
angle  error. 
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Figure  8.4  Combined  results  for  the  feature  vector  experiment: 
classification  accuracy. 
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Combined  results  for  the  feature  vector  experiment:  median  x 
angle  error. 
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Figure  8.6  Combined  results  for  the  feature  vector  experiment:  median  y 
angle  error. 


Cl  ass if i cat  ion  accuracy  <> 


200 


6  and  10  dB  signal- to- noise  ratio.  The  moments  methods  exhibit  a  very  soft 
threshold.  The  cumulative  angular  deviant  Fourier  descriptors  method  exhibits 
a  very  prominent  threshold  between  10  and  20  dB  signal-to-noise  ratio. 

The  results  for  the  partial  shape  experiment  are  plotted  in  Figure  8.13, 
Figure  8.14,  and  Figure  8.15.  The  Fourier  descriptors  and  Walsh  points  have 
similar  characteristics.  The  performance  of  both  of  these  methods  decreases 
rapidly  once  more  than  lO^c  of  the  contour  is  chopped.  Since,  these  are  all 
global  shape  methods,  this  is  an  expected  result.  The  moments  methods, 
however,  degrade  very  slowly.  Their  performance  is  so  poor  to  begin  with,  this 
ability  is  still  not  very  useful.  The  performance  of  the  cumulative  angular 
deviant,  on  the  other  hand,  degrades  immediately  from  its  already  poor  level  as 
the  contour  is  chopped. 


8.4  Conclusions 

The  Fourier  descriptors  of  the  boundary  and  the  Walsh  points  are  very 
similar  methods.  Both  perform  excellently  in  comparison  to  the  other  shape 
methods.  The  cumulative  angular  deviant  is  a  close  relative  of  the  Fourier 
descriptors.  So  its  performance  is  closer  to  that  of  the  Fourier  descriptors  of 
the  boundary.  But  it  is  still  significantly  worse  in  classification  accuracy  with 
respect  to  the  Fourier  descriptors  of  the  boundary. 

The  moments  methods  had  a  bad  performance.  From  the  results 
published  in  the  literature,  this  was  unexpected.  It  is  difficult  to  explain  the 
poor  results,  but  as  noted  earlier,  it  seems  that  the  large  dynamic  range  in  the 
moments  might  cause  this  poor  performance.  Another  possibility  might  be  that 
smooth  polynomials  are  being  asked  to  approximate  a  discontinuous  binary 
function. 
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Figure  8.10  Combined  results  for  the  imaging  noise  experiment:  classification 
accuracy. 
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Figure  8.11 
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Figure  8.12  Combined  results  for  the  imaging  noise  experiment:  median  y 
angle  error. 
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Figure  8.13  Combined  results  for  the  partial  shape  experiment:  classification 
accuracy. 
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Figure  8.14  Combined  results  for  the  partial  shape  experiment:  median  x 
angle  error. 
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Figure  8.15  Combined  results  for  the  partial  shape  experiment:  median  y 
angle  error. 
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CHAPTER  0 

FOURIER-MELLIN  TRANSFORM  TECHNIQUE 
FOR  PARTIAL  SHAPE  RECOGNITION 

0.1  Introduction 

Often  the  shapes  to  be  recognized  are  only  partially  correct.  This  can 
come  about  as  a  result  of  noise,  or  of  errors  in  segmenting  the  object,  or  if  the 
view  of  the  object  is  in  some  way  obscured.  If  this  happens  then  some  part  of 
the  object’s  contour  is  either  missing  or  distorted.  Methods  of  complete  shape 
analysis  do  not  deal  with  this  problem.  Global,  complete  shape  analysis 
methods  rely  on  the  fact  that  shape  information  distributed  over  the  entire 
shape  effects  each  component  of  the  characterization.  So,  when  part  of  the 
shape  is  in  error,  all  of  the  features  are  degraded.  Local,  complete  shape 
methods,  even  though  using  local  information,  usually  utilize  some  global 
information  to  normalize  the  local  features.  Sometimes  this  global  information 
needed  can  be  tacitly  assumed.  For  instance,  the  method  used  by  Gifford 
[GIFF82j  uses  the  distances  from  a  group  of  features  to  normalize  all  the  other 
distance  features.  Partial  shape  analysis  methods  attempt  to  overcome  this 
difficulty.  Partial  methods  extract  local  features  invariant  to  scale,  rotation, 
etc.  and  then  attempt  to  combine  features  when  necessary  to  attempt  a  match 
to  the  corresponding  features  of  a  prototype. 

A  great  deal  of  effort  has  been  exerted  to  apply  syntactic  pattern 
recognition  methods  to  partial  shape  recognition  [FU82].  Grammars  are 
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developed  to  implement  rules  for  the  shape  allowing  for  missing  parts. 
However,  it  becomes  very  difficult  to  generate  such  rules  when  there  are  a  large 
number  of  objects  and  possibly  .many  views  of  each  object.  Algorithms  for 
automatic  generation  of  the  necessary  rules  from  test  data  have  not  been 
developed.  As  an  alternative,  methods  that  try  to  dynamically  "warp”  the 
features  to  provide  a  match  to  the  library  shape  have  been  developed 
[WALL81,GIFF82j.  These  methods  have  had  some  success,  but  the  need  to 
test  many  possible  combinations  of  features  causes  them  to  be  very  time 
consuming.  Also,  there  are  a  number  of  thresholds  and  others  parameters  that 
must  be  determined  a  priori. 

The  method  of  partial  shape  recognition  presented  here  attempts  to 
overcome  these  difficulties  by  using  transform  methods  [GROG83],  The 
essential  character  of  the  method  is  that  it  uses  global  information  to 
determine  how  the  features  are  to  be  aligned.  Then  a  local  comparison  to 
determine  the  degree  of  match  is  performed.  This  global  information  is 
obtained  by  the  use  of  a  shift  and  scale  invariant  Fourier-Mellin  correlation. 

9.2  Curvature  Function 

Because  of  the  physiological  and  psychological  evidence,  most  partial 
shape  methods  use  the  curvature  or  the  angles  associated  with  the  boundary  of 
an  object.  If  the  boundary  function,  ^(t)  =  x(t)  +  iy(t),  is  twice  differentiable 
and  V(t)  /  0,  tG(0,LJ,  then  the  curvature  function,  #c(t),  is 

*c(t)  =  -^-tan-1-^^-  , 

V  '  at  x(t) 

where  x(t)  and  y(t)  are  the  derivatives  of  x  and  y  with  respect  to  t.  Since  the 
boundary  functions  for  discrete  images  are  polygons,  a  discrete  approximation 
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of  the  curvature  [BENN75]  is 


_  ,  -i  y»  -  yi-i  .  xi-i  ■  yi-2 
K-,  -  tan  1 - tan  * - ,  i  -  0,  •  •  •  ,K~1  . 


xi  -  yi-i 


X|-I  xi-2 


The  discrete  nature  of  the  image  and  the  use  of  the  Freeman  chain  code  causes 
the  curvature  to  be  quantized.  Filtering  is  used  to  smooth  the  curvature 
function. 

The  curvature  uniquely  specifies  a  curve  independent  of  translation  and 
rotation.  The  scale  of  the  shape  is  normalized  by  scaling  the  period  of  the 
curvature  function.  So,  the  curvature  function  sequence  is  resampled  to  a  fixed 
number  of  points.  For  a  simple  closed  curve,  the  integral  of  the  curvature  over 
a  period  is  ±2  tt  (360  ).  So,  the  resampling  is  performed  in  such  a  way  to 
preserve  the  total  curvature. 

If  the  shape  was  complete  then  only  a  starting  point  normalization  would 
be  necessary  to  compare  an  unknown  curvature  function  to  a  prototype. 
However,  when  the  contour  is  only  partially  correct,  then  the  contour  length  of 
corresponding  curvature  features  is  also  likely  to  be  different.  So,  a  scaling  and 
shift  of  the  time  axis  of  the  curvature  needs  to  be  determined.  The  Mellin 
transform  has  a  time  scale  property  [ALTE78,BAUD73,ROBB72]  that  is  useful 
in  determining  the  rescaling  of  the  time  axis  of  the  unknown  to  match  the 
prototype. 
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0.3  Mellin  Transform 

The  Mellin  transform  has  been  used  before  to  perform  scale  invariant 
correlation  (CASA76a,CASA76b,CASA77,CASA78).  The  Mellin  transform  is 
defined  as 

00 

MF(s)  =  / F(x)xs-Idx  . 
o 

Evaluating  the  transform  along  the  imaginary  axis, 

00 

Mp(-iu)  =  /F(x)x"iu_,dx  . 

» 

The  Mellin  has  the  following  time  scale  property.  If  G(x)  =  F(ax),  then 
MG(-iu)  =MF(-iu)a~iu  =MF(-iu)e“ittln‘  Thus,  |  MG(-iu)|  =  |  MF(-iu)| .  A 
scaling  of  the  independent  variable  introduces  a  simple  linear  phase  shift  in  the 
transform  domain. 

Since  the  curvature  functions  for  the  boundaries  are  discrete,  a  discrete 
Mellin  transform  is  desired.  The  Mellin  transform  is  related  to  the  Fourier 
transform.  If  x  =  T  el,  then  the  Mellin  transform  becomes 

00 

MF(-iu)  =  T“iu  /  F(Tel)  eHutdt  . 

“00 

M{F(x)}  =  T-iu  f* {F(T e*)}  . 

For  a  discrete  signal,  once  the  exponential  sampling  has  been  completed,  then 
the  Fast  Fourier  transform  (FFT)  is  used  to  compute  the  Fourier  transform. 
There  are  some  difficulties  when  the  sequence  is  exponentially  sampled. 
Resampling  of  the  function  near  the  origin  creates  problems.  Also,  it  is  not 
clear  what  kind  of  resampling  should  be  done.  To  circumvent  these  problems  a 


direct  calculation  of  the  Mellin  transform  has  been  developed  (ZWIC83).  The 
discrete  signal  is  assumed  to  be  constant  over  each  sample  interval  and  the  first 
sample  in  the  sequence  is  zero.  The  Direct  Mellin  Transform  (DMT)  is 

K~1  Au  i, 

MfI-JUj)  -  E  e 

k=0  1Uj 


where 


Ak  =  F(xk)  -  F(xk  +  1)  and  u=  =  ~  \  ,  i  =  1,2,  •  ,N-1 

IN 


0.4  Fourier-Mellin  Technique 

In  order  to  match  an  unknown  curvature  function  for  a  partial  shape  to  a 
prototype,  it  is  first  necessary  to  eliminate  the  effects  of  the  unknown  starting 
point  shift.  This  is  accomplished  by  taking  the  magnitude  of  the  Fourier 
transform.  The  Fourier  transform  has  a  time  shift  property  analogous  to  the 
scale  property  of  the  Mellin  transform.  If  g(t)  =  f(t  — 10)  and  a>0,  then 
F  {g(t)}  =  e  lwt°  r  {*(<-)}•  So,  |  f  {g(t)}|  =  |  Y  {f(t)>  |  .  Taking  the  magnitude 
of  the  Fourier  transform  to  obtain  time  shift  invariance  may  cause  some 
problems.  The  loss  of  information  in  the  phase  may  cause  ambiguity  that  will 
place  a  number  of  curvature  functions  into  the  same  class  [WALT63].  For 
instance,  a  signal  and  its  Hilbert  transform  have  the  same  Fourier  transform 
magnitude. 

The  Mellin  transform  is  used  to  obtain  a  scale  estimate.  Let 


g(t)  =  f(a[t-t0]),(a>0).  Then 
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ir<*w>i  =ifF(^)| , 


where  F(u>)  =  Y  Let 


F»  = 


FH|  ■  ">«.  g*m 

0,  w<0 


|f{g(t)}|.  w  >  o, 

0,  w  <  0  ' 


Now,  taking  the  Mellin  transform 

MG(-iu)  =  —  Mp(-iu)  ,  where 
a 

Mp(-iu)  =  M{F+(u)}  MG(-iu)  =  M{G+M}  . 


So,  the  F ourier  transform  of  Mp  x  MG*  is  a  cross  correlation  whose  peak  is 
shifted  from  the  origin  by  In  (a),  i.e. 

Y  (Mp  x  MG*}  =  CFF(r  -  In  a)  , 


where 

Cpp(r)  =  r{Mp  x  Mf*}  . 

Once  this  scale  is  determined,  the  unknown  is  resampled.  The  peak  of  a 
circular  correlation  of  the  prototype  curvature  sequence  and  the  resampled 
unknown  curvature  sequence  gives  the  shift  necessary  to  align  the  two 
curvature  functions.  Now  a  local  comparison  can  be  made.  The  local 
comparison  used  in  this  report  is  simply  the  pointwise  difference  between  the 
two  curvature  functions  weighted  by  l/cosh(KjP),  where  K;P  is  a  smoothed 
version  of  the  prototype  curvature  function,  i.e. 


m  ni  iV'i  .  1 1  A  ■■Miifii 
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cosh(<cjP) 

The  weighting  function  was  used  to  make  it  easier  for  a  match  to  be  made  even 
when  a  curvature  peak  is  slightly  misaligned.  The  weighted  difference  sequence 
is  then  thresholded  to  determine  the  segments  of  the  shape  that  match  the 
prototype,  i.e. 

d/  >  t  ,  then  the  ith  segment  doesn’t  match. 

d/  <  t  ,  then  the  ith  segment  matches. 

An  overall  measure  of  the  degree  that  the  unknown  matches  the  prototype, 
similar  to  the  distance  used  by  Gifford  [GIFF82],  is  defined  as 

d  =  d'  +  y  <y  -  1) , 

where 

d'  =  Ed/  , 

ieM 

and  M  =  set  of  indices,  i,  where  K;U  matches  kp  and  f  =  the  fraction  of  the 
total  number  of  points  that  match.  It  is  this  distance  measure  that  is  used  to 
classify  the  unknown  shapes  in  the  experiments  described  later. 

9.6  Experimental  Results 

First,  the  steps  of  the  algorithm  will  be  followed  for  a  single  unknown 
shape.  Figure  9.1  depicts  a  F104  prototype  contour  and  an  unknown  contour. 
This  is  the  prototype  with  10%  of  the  original  contour  deleted.  Also  notice 
that  the  starting  point  has  been  changed.  The  deleted  contour  segment  is 
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replaced  with  a  right  angle  segment  which,  in  this  case,  is  longer  than  the 
original  piece  of  contour  it  replaces.  Figure  9.2  is  the  curvature  function  for 
the  prototype  contour  before  filtering.  A  30  point  rectangular  filter  is  then 
applied.  The  resulting  sequence  is  shown  in  Figure  9.3.  This  sequence  is  then 
resampled  from  624  samples  to  256  samples.  (See  Figure  9.4.)  The  unknown 
curvature  function  is  also  computed,  smoothed,  and  resampled  to  256  samples. 
It  is  shown  in  Figure  9.5. 

For  comparative  purposes,  the  unknown  and  prototype  are  correlated. 
(See  Figure  9.6.)  Then  the  unknown  is  shifted  by  the  amount  corresponding  to 
the  peak  in  the  correlation.  From  Figure  9.7  it  is  easy  to  see  the  effect  the 
scale  difference  might  have  on  the  matching  process.  The  simple  difference 
function  for  this  shift  of  the  unknown  curvature  and  the  prototype  is  shown  in 
Figure  9.8.  This  difference  is  large  even  over  segments  where  the  two  shapes 
should  correspond.  This  illustrates  the  necessity  of  determining  the  proper 
time  scale. 

Now  the  FFT  is  used  to  compute  Fourier  transforms.  Then  DC 
component  is  set  to  zero  and  the  magnitude  computed.  The  Fourier  transform 
magnitudes  are  shown  in  Figure  9.9  and  Figure  9.10.  The  Mellin  transform  is 
computed  using  the  direct  method  (DMT).  The  magnitude  of  the  Mellin 
transform  for  both  the  unknown  and  prototype  Fourier  magnitudes  are  shown 
together  in  Figure  9.11.  It  is  clear  that  the  Mellin  transform  magnitudes  of  the 
two  Fourier  magnitudes  are  very  similar.  Now  the  Mellin  transforms  are 
multiplied.  The  cross  correlation  is  obtained  by  taking  a  DFT  of  the  resulting 
sequence.  The  magnitude  of  this  cross  correlation  is  shown  in  Figure  9.12. 
Notice  that  the  peak  has  shifted  from  the  origin  corresponding  to  the  logarithm 
of  the  scale  change.  This  scale  is  used  to  resample  the  unknown.  The  scaled 
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Difference  function  for  prototype  and  shifted  unknown  curvature 
functions. 
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I  Magnitude  of  the  FFT  for  the  prototype  curvature  function. 
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Figure  9.10 


i. 


Magnitude  of  the  FFT  for  the  unknown  curvature  function. 
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Figure  0.17  Weighted  difference  function  (dotted)  with  segmentation  (solid). 
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Figure  9.20 


matching 

non-matching 


Segmented  reconstructed  contour,  10%  chopped. 


Figure  9.21  Segmented  reconstructed  contour,  20%  chopped. 
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Figure  9.22 


Figure  9.23 


Segmented  reconstructed  contour,  30%  chopped. 


Segmented  reconstructed  contour,  40%  chopped. 
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Figure  9.24 


Figure  9.25 


Segmented  reconstructed  contour,  50%  chopped. 
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Figure  9.26 


Figure  9.27 


Segmented  reconstructed  contour,  70%  chopped. 
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Segmented  reconstructed  contour,  80%  chopped. 
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Figure  9.28  Segmented  reconstructed  contour,  00%  chopped. 
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Figure  9.31  B57  overhead  view  chopped  0 %  to  90%. 
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Figure  9.32  Phantom  overhead  view  chopped  0 %  to  90%. 
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Figure  9.35 


» 


Mirage  overhead  view  chopped  0%  to  90%. 


Table  9.1 


Fourier-Mellin  partial  shape  method  classification  results  for 
overhead  views. 


Percent 

Chopped 

_ 


Number  Correctly 
Classified  (out  of  six) 
6 
6 
5 


0.8  Conclusions 


This  transform  method  attempts  to  overcome  the  many  problems 
previously  encountered  in  performing  partial  shape  recognition.  A  large 
classification  experiment  would  be  needed  to  verify  its  capabilities.  The 
method  appears  to  perform  well  with  up  to  50%  of  the  contour  chopped. 
There  are  several  important  questions  concerning  the  use  of  this  method  that 
have  yet  to  be  addressed.  Some  of  these  are  a)  How  many  Fourier  and  Mellin 
components  are  necessary  to  still  obtain  adequate  results?  b)  What  are  the 
computation  and  storage  tradeoffs  with  respect  to  the  other  methods  available? 
c)  What  information  is  lost  in  taking  the  modulus  of  the  Fourier  transform  of 
the  curvature  function?  d)  What  type  of  error  norm  should  be  used  to 
determine  a  match?  Once  the  scale  and  starting  point  alignment  has  taken 
place  using  the  Fourier-Mellin  transform  technique,  then  some  of  the  other 
matching  techniques,  such  as  dynamic  time  warping,  might  be  used  to  improve 
the  classification  results.  The  introduction  of  this  method  provides  the 
possibility  of  using  transform  methods  to  partial  shape  recognition. 


CHAPTER  10 

CONCLUSIONS  AND  RECOMMENDATIONS 


The  global  shape  methods  using  functional  approximation  have  been 
investigated  in  several  ways.  The  accumulation  of  the  various  properties 
related  to  the  geometry  of  a  shape  have  helped  to  indicate  how  the  methods 
construct  a  characterization  of  the  shape.  The  set  of  shape  recognition 
experiments  provided  empirical  results  useful  in  determining  the  effectiveness  of 
the  various  shape  methods  in  performing  shape  recognition. 

The  Fourier  descriptors  of  the  boundary  performed  better  than  any  of  the 
other  four  methods.  Walsh  points  of  the  boundary  performed  almost  as  well 
and  can  be  computed  quickly.  The  Walsh  points,  however,  do  not  have  the 
nice  properties  of  the  Fourier  descriptors  of  the  boundary.  The  performance  of 
the  Fourier  descriptors  of  the  cumulative  angular  deviant  was  significantly 
below  that  of  the  Fourier  descriptors  of  the  boundary.  The  performances  of 
the  two  moment  methods  were  rather  poor,  far  below  that  of  the  other  three 
methods  just  mentioned.  There  is  a  numerical  precision  or  dynamic  range 
problem  in  using  the  moments  methods  that  has  yet  to  be  overcome. 
Obtaining  the  moments  of  the  silhouette  is  really  a  two-dimensional  transform. 
Much  of  the  information  contained  in  the  moment  set  is  necessary  to 
approximate  the  binary  nature  of  the  silhouette  rather  than  extracting  the 
shape  characteristics.  The  moments,  however,  can  be  used  when  there  is  more 
than  one  connected  component  to  consider  as  a  single  shape. 


Another  important  issue  is  the  representation  problem  for  a  three- 
dimensional  rigid  body.  The  questions  of  how  many  and  which  views  are 
adequate  to  represent  the  shape  have  yet  to  be  answered.  Some  empirical  data 
for  some  complex  objects  (aircraft)  has  been  presented  by  Charpentier  and 
Glenn  (CHAR81,GLEN82).  In  general,  the  theory  necessary  to  solve  these 
problems  is  not  available.  However,  it  has  been  determined  that  two  non¬ 
parallel  views  are  adequate  to  represent  a  quadric  surface  (GROS83).  A  related 
problem  is  the  following:  How  •  many  different  directions  are  necessary  to 
illuminate  the  entire  boundary  of  a  convex  body  with  a  bundle  of  (parallel) 
rays?  It  has  been  asserted  that  8  or  less  different  views  are  needed  to 
illuminate  the  surface  of  a  three-dimensional  convex  body  [BOLT80].  This 
illumination  of  the  surface  is  similar  to  the  operation  of  producing  a  two- 
dimensional  range  image  of  a  three-dimensional  convex  object.  It  is  also  known 
that  the  intersection  of  the  back  projections  of  the  simply  connected  silhouettes 
forms  a  convex  hull  for  the  object. 

Another  possible  way  to  formulate  this  problem  is  with  the  use  of  the 
Radon  transform  [GELF6B].  Let  I^(x'  ,y' )  be  the  Radon  transform  of  the 
three-dimensional  rigid  body,  O,  with  a  constant  density,  p,  for  the  parallel 
bundle  of  rays  with  a  direction  The  silhouette  is  then  the  thresholded 

image, 

V(x'»y')  =  g(i#*(x\y')) 

where  g(t)  =  1,  if  t>0  and  g(t)  =  0,  if  t<0.  This  thresholded  image  is  then 
used  to  reconstruct  the  object.  If  the  thresholding  had  not  taken  place,  this 
would  be  ordinary  reconstruction  from  projections.  So,  the  question  becomes 
this:  Under  what  conditions  can  the  function  f(x,y,z)  be  approximated  to  an 
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acceptable  degree  of  accuracy  when  reconstructed  from  a  limited  number  of 
thresholded  views  given  the  additional  constraints  that  the  object  is  of  bounded 
support  and  is  of  constant  density? 

The  representation  problem  is  important  for  practical  reasons,  since 
having  the  smallest  possible  number  of  library  views  would  reduce  the  storage 
requirements  and  also  reduce  the  time  necessary  to  determine  a  match. 

The  Fourier-Mellin  technique  introduced  has  the  potential  of  solving  some 
of  the  problems  that  make  it  difficult  to  recognize  partial  shapes.  This  method 
was  shown  to  be  capable  of  recognizing  shapes  with  up  to  50%  of  the  contour 
in  error.  A  larger  partial  shape  recognition  experiment  would  help  verify  the 
method’s  usefulness.  The  technique  of  time  shift  and  scale  invariant 
correlation  should  have  applications  in  many  areas  where  part  of  a  signal  is 
seriously  degraded  but  a  match  to  a  prototypical  signal  is  desired.  More 
research  is  also  needed  to  determine  the  best  segmentation  procedure  and 
distance  function  for  the  matching  process.  More  investigations  are  also  needed 
to  determine  how  many  Fourier  and  Mellin  components  are  actually  necessary 
to  obtain  an  adequate  level  of  performance. 
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